(57) Abstract: The present invention relates to a novel method for analyzing nucleic acid sequences based on real-time detection of
DNA polymerase-catalyzed incorporation of each of the four nucleotide bases, supplied individually and serially in a microfluidic
system, to a reaction cell containing a template system comprising a DNA fragment of unknown sequence and an oligonucleotide
primer. Incorporation of a nucleotide base into the template system can be detected by any of a variety of methods including but
not limited to fluorescence and chemiluminescence detection. Alternatively, microcalorimetric detection of the heat generated by the
incorporation of a nucleotide into the extending template system using thermopile, thermistor and refractive index measurements
can be used to detect extension reactions.



WO 03/020895 A2

PCT/US02/27605 



METHOD OF DETERMINING THE NUCLEOTIDE SEQUENCE OF 
OLIGONUCLEOTIDES AND DNA MOLECULES 

SPECIFICATION 

1. INTRODUCTION 

The present invention relates to a novel method for analyzing nucleic
acid sequences based on real-time detection of DNA polymerase-catalyzed
incorporation of each of the four deoxynucleoside monophosphates, supplied
individually and serially as deoxynucleoside triphosphates in a microfluidic system, to
a template system comprising a DNA fragment of unknown sequence and an
oligonucleotide primer. Incorporation of a deoxynucleoside monophosphate (dNMP)
into the primer can be detected by any of a variety of methods including but not
limited to fluorescence and chemiluminescence detection. Alternatively,
microcalorimetric detection of the heat generated by the incorporation of a dNMP into
the extending primer using thermopile, thermistor and refractive index measurements
can be used to detect extension reactions. The present invention further provides a
method for monitoring and correction of sequencing errors due to misincorporation or
extension failure.

The present invention provides a method for sequencing DNA that
avoids electrophoretic separation of DNA fragments, thus eliminating the problems
associated with anomalous migration of DNA due to repeated base sequences or other
self-complementary sequences which can cause single-stranded DNA to
self-hybridize into hairpin loops, and also avoids current limitations on the size of
fragments that can be read. The method of the invention can be utilized to determine
the nucleotide sequence of genomic or cDNA fragments, or alternatively, as a
diagnostic tool for sequencing patient derived DNA samples.

2. BACKGROUND OF THE INVENTION 
Currently, two approaches are utilized for DNA sequence
determination: the dideoxy chain termination method of Sanger (1977, Proc. Natl.
Acad. Sci. 74:5463-5467) and the chemical degradation method of Maxam (1977,
Proc. Natl. Acad. Sci. 74:560-564). The Sanger dideoxy chain termination method is
the most widely used method and is the method upon which automated DNA
sequencing machines rely. In the chain termination method, DNA polymerase
enzyme is added to four separate reaction systems to make multiple copies of a
template DNA strand in which the growth process has been arrested at each
occurrence of an A, in one set of reactions, and a G, C, or T, respectively, in the other
sets of reactions, by incorporating in each reaction system one nucleotide type lacking
the 3'-OH on the deoxyribose at which chain extension occurs. This procedure
produces a series of DNA fragments of different lengths, and it is the length of the
extended DNA fragment that signals the position along the template strand at which
each of the four bases occurs. To determine the nucleotide sequence, the DNA fragments
are separated by high resolution gel electrophoresis and the order of the four bases is
read from the gel.

A major research goal is to derive the DNA sequence of the entire
human genome. To meet this goal the need has developed for new genomic
sequencing technology that can dispense with the difficulties of gel electrophoresis,
lower the costs of performing sequencing reactions, including reagent costs, increase
the speed and accuracy of sequencing, and increase the length of sequence that can be
read in a single step. Potential improvements in sequencing speed may be provided
by a commercialized capillary gel electrophoresis technique such as that described in
Marshall and Pennisi (1998, Science 280:994-995). However, a major problem
common to all gel electrophoresis approaches is the occurrence of DNA sequence
compressions, usually arising from secondary structures in the DNA fragment, which
result in anomalous migration of certain DNA fragments through the gel.

As genomic information accumulates and the relationships between
gene mutations and specific diseases are identified, there will be a growing need for
diagnostic methods for identification of mutations. In contrast to the large scale
methods needed for sequencing large segments of the human genome, what is needed
for diagnostic methods are repetitive, low-cost, highly accurate techniques for
resequencing of certain small isolated regions of the genome. In such instances,
methods of sequencing based on gel electrophoresis readout become far too slow and
expensive.

When considering novel DNA sequencing techniques, the possibility
of reading the sequence directly, much as the cell does, rather than indirectly as in the
Sanger dideoxynucleotide approach, is a preferred goal. This was the goal of early
unsuccessful attempts to determine the shapes of the individual nucleotide bases with
scanning probe microscopes.

Another approach for reading a nucleotide sequence
directly is to treat the DNA with an exonuclease coupled with a detection scheme for
identifying each nucleotide sequentially released, as described in Goodwin et al.
(1995, Experimental Techniques of Physics 41:279-294). However, researchers using
this technology are confronted with the enormous problem of detecting and
identifying single nucleotide molecules as they are digested from a single DNA
strand. Simultaneous exonuclease digestion of multiple DNA strands to yield larger
signals is not feasible because the enzymes rapidly get out of phase, so that
nucleotides from different positions on the different strands are released together, and
the sequences become unreadable. It would be highly beneficial if some means of
external regulation of the exonuclease could be found so that multiple enzyme
molecules could be compelled to operate in phase. However, external regulation of
an enzyme that remains docked to its polymeric substrate is exceptionally difficult, if
not impossible, because after each digestion the next substrate segment is immediately
present at the active site. Thus, any controlling signal must be present at the active
site at the start of each reaction.

A variety of methods may be used to detect the polymerase-catalyzed
incorporation of deoxynucleoside monophosphates (dNMPs) into a primer at each
template site. For example, the pyrophosphate released whenever DNA polymerase
adds one of the four dNTPs onto a primer 3' end may be detected using a
chemiluminescent based detection of the pyrophosphate as described in Hyman E.D.
(1988, Analytical Biochemistry 174:423-436) and U.S. Patent No. 4,971,903. This
approach has been utilized most recently in a sequencing approach referred to as
"sequencing by incorporation" as described in Ronaghi (1996, Analytical Biochem.
242:84) and Ronaghi (1998, Science 281:363-365). However, there exist two key
problems associated with this approach: destruction of unincorporated nucleotides and
detection of pyrophosphate. The solution to the first problem is to destroy the added,
unincorporated nucleotides using a dNTP-digesting enzyme such as apyrase. The
solution to the second is the detection of the pyrophosphate using ATP sulfurylase to
reconvert the pyrophosphate to ATP which can be detected by a luciferase
chemiluminescent reaction as described in U.S. Patent No. 4,971,903 and Ronaghi
(1998, Science 281:363-365). Deoxyadenosine α-thiotriphosphate is used instead of
dATP to minimize direct interaction of injected dATP with the luciferase.

Unfortunately, the requirement for multiple enzyme reactions to be
completed in each cycle imposes restrictions on the speed of this approach while the
read length is limited by the impossibility of completely destroying unincorporated,
non-complementary, nucleotides. If some residual amount of one nucleotide remains
in the reaction system at the time when a fresh aliquot of a different nucleotide is
added for the next extension reaction, there exists a possibility that some fraction of
the primer strands will be extended by two or more nucleotides, the added nucleotide
type and the residual impurity type, if these match the template sequence, and so this
fraction of the primer strands will then be out of phase with the remainder. This
out-of-phase component produces an erroneous incorporation signal which grows larger
with each cycle and ultimately makes the sequence unreadable.

A different direct sequencing approach, which uses dNTPs tagged at the 3'-OH
position with four different colored fluorescent tags, one for each of the four
nucleotides, is described in Metzker, M.L., et al. (1994, Nucleic Acids Research
22:4259-4267). In this approach, the primer/template duplex is contacted with all
four dNTPs simultaneously. Incorporation of a 3' tagged NMP blocks further chain
extension. The excess and unreacted dNTPs are flushed away and the incorporated
nucleotide is identified by the color of the incorporated fluorescent tag. The
fluorescent tag must then be removed in order for a subsequent incorporation reaction
to occur. Similar to the pyrophosphate detection method, incomplete removal of a
blocking fluorescent tag leaves some primer strands unextended on the next reaction
cycle, and if these are subsequently unblocked in a later cycle, once again an
out-of-phase signal is produced which grows larger with each cycle and ultimately
limits the read length. To date, this method has been demonstrated to work for
only a single base extension. Thus, this method is slow and is likely to be restricted to
very short read lengths because 99% efficiency in removal of the tag is
required to read beyond 50 base pairs. Incomplete removal of the label results in
out-of-phase extended DNA strands.
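
The read-length limit quoted above follows from compounding the per-cycle removal efficiency. The following is a minimal sketch, not taken from the cited work, which assumes that each tag-removal step succeeds independently and with the same probability on every strand (a simplifying assumption). The function names are hypothetical.

```python
import math


def in_phase_fraction(efficiency: float, cycles: int) -> float:
    """Fraction of strands still in phase after `cycles` tag-removal steps.

    Assumes every step succeeds independently with probability `efficiency`.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return efficiency ** cycles


def max_read_length(efficiency: float, min_fraction: float) -> int:
    """Largest number of cycles for which at least `min_fraction` stay in phase."""
    if not 0.0 < efficiency < 1.0 or not 0.0 < min_fraction < 1.0:
        raise ValueError("both arguments must be strictly between 0 and 1")
    return math.floor(math.log(min_fraction) / math.log(efficiency))


if __name__ == "__main__":
    # With 99% removal efficiency, roughly 60% of strands remain in phase at base 50.
    print(round(in_phase_fraction(0.99, 50), 3))
    # A lower efficiency leaves a much smaller in-phase population at the same depth.
    print(round(in_phase_fraction(0.95, 50), 3))
```

Under this model the in-phase fraction decays geometrically, which is why even a small per-cycle loss restricts the readable length.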

3. SUMMARY OF THE INVENTION 

Accordingly, it is an object of the present invention to provide a novel
method for determining the nucleotide sequence of a DNA fragment which
eliminates the need for electrophoretic separation of DNA fragments. The inventive
method, referred to herein as "reactive sequencing", is based on detection of DNA
polymerase catalyzed incorporation of each of the four nucleotide types, when
deoxynucleoside triphosphates (dNTPs) are supplied individually and serially to a
DNA primer/template system. The DNA primer/template system comprises a single
stranded DNA fragment of unknown sequence, an oligonucleotide primer that forms a
matched duplex with a short region of the single stranded DNA, and a DNA
polymerase enzyme. The enzyme may either be already present in the template
system, or may be supplied together with the dNTP solution.

Typically a single deoxynucleoside triphosphate (dNTP) is added to
the DNA primer template system and allowed to react. As used herein,
deoxyribonucleotide means and includes, in addition to dGTP, dCTP, dATP, dTTP,
chemically modified versions of these deoxyribonucleotides or analogs thereof. Such
chemically modified deoxyribonucleotides include but are not limited to those
deoxyribonucleotides tagged with a fluorescent or chemiluminescent moiety.
Analogs of deoxyribonucleotides that may be used include but are not limited to
7-deazapurine. The present invention additionally provides a method for improving
the purity of deoxynucleotides used in the polymerase reaction.

An extension reaction will occur only when the incoming dNTP base is
complementary to the next unpaired base of the DNA template beyond the 3' end of
the primer. While the reaction is occurring, or after a delay of sufficient duration to
allow a reaction to occur, the system is tested to determine whether an additional
nucleotide derived from the added dNTP has been incorporated into the DNA
primer/template system. A correlation between the dNTP added to the reaction cell
and detection of an incorporation signal identifies the nucleotide incorporated into the
primer/template. The amplitude of the incorporation signal identifies the number of
nucleotides incorporated, and thereby quantifies single base repeat lengths where
these occur. By repeating this process with each of the four nucleotides individually,
the sequence of the template can be directly read in the 5' to 3' direction one
nucleotide at a time.
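
The cycle described above can be illustrated with a short idealized simulation. This is a sketch under stated assumptions rather than the method as implemented: it assumes error-free extension, complete incorporation across single base repeats, and a fixed dNTP order of A, C, G, T. The names `simulate_flows` and `read_extension` are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def simulate_flows(template: str, cycle: str = "ACGT") -> list[tuple[str, int]]:
    """Return (dNTP, bases incorporated) for each dNTP addition.

    `template` lists the unpaired template bases in the order the polymerase
    meets them, starting just beyond the 3' end of the primer.
    """
    if set(cycle) != set(COMPLEMENT) or not set(template) <= set(COMPLEMENT):
        raise ValueError("cycle must contain A, C, G and T; template only those bases")
    flows = []
    position = 0
    while position < len(template):
        for dntp in cycle:
            incorporated = 0
            # The polymerase keeps extending while the supplied dNTP pairs with
            # the next template base, so a repeat gives a proportionally larger signal.
            while position < len(template) and COMPLEMENT[template[position]] == dntp:
                position += 1
                incorporated += 1
            flows.append((dntp, incorporated))
            if position == len(template):
                break
    return flows


def read_extension(flows: list[tuple[str, int]]) -> str:
    """Rebuild the extended primer sequence (5' to 3') from the flow record."""
    return "".join(dntp * count for dntp, count in flows)


if __name__ == "__main__":
    flows = simulate_flows("TTGCA")
    print(flows)                  # one entry per dNTP addition
    print(read_extension(flows))  # sequence added to the primer
```

A count of zero corresponds to a dNTP addition that produced no incorporation signal, and a count above one corresponds to a single base repeat.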

Detection of the polymerase mediated extension reaction and
quantification of the extent of reaction can occur by a variety of different techniques,
including but not limited to, microcalorimetric detection of the heat generated by the
incorporation of a nucleotide into the extending duplex. Optical detection of an
extension reaction by fluorescence or chemiluminescence may also be used to detect
incorporation of nucleotides tagged with fluorescent or chemiluminescent entities into
the extending duplex. Where the incorporated nucleotide is tagged with a
fluorophore, excess unincorporated nucleotide is removed, and the template system is
illuminated to stimulate fluorescence from the incorporated nucleotide. The
fluorescent tag may then be cleaved and removed from the DNA template system
before a subsequent incorporation cycle begins. A similar process is followed for
chemiluminescent tags, with the chemiluminescent reaction being stimulated by
introducing an appropriate reagent into the system, again after excess unreacted
tagged dNTP has been removed; however, chemiluminescent tags are typically
destroyed in the process of readout and so a separate cleavage and removal step
following detection may not be required. For either type of tag, fluorescent or
chemiluminescent, the tag may also be cleaved after incorporation and transported to
a separate detection chamber for fluorescent or chemiluminescent detection. In this
way, fluorescent quenching by adjacent fluorophore tags incorporated in a single base
repeat sequence may be avoided. In addition, this may protect the DNA template
system from possible radiation damage in the case of fluorescent detection or from
possible chemical damage in the case of chemiluminescent detection. Alternatively,
the fluorescent tag may be selectively destroyed by a chemical or photochemical
reaction. This process eliminates the need to cleave the tag after each readout, or to
detach and transport the tag from the reaction chamber to a separate detection
chamber for fluorescent detection. The present invention provides a method for
selective destruction of a fluorescent tag by a photochemical reaction with
diphenyliodonium ions or related species.

The present invention further provides a reactive sequencing method
that utilizes a two cycle system. An exonuclease-deficient polymerase is used in the
first cycle and a mixture of exonuclease-deficient and exonuclease-proficient enzymes
is used in the second cycle. In the first cycle, the template-primer system together
with an exonuclease-deficient polymerase will be presented sequentially with each of
the four possible nucleotides. In the second cycle, after identification of the correct
nucleotide, a mixture of exonuclease proficient and deficient polymerases, or a
polymerase containing both types of activity, will be added together
with the correct dNTP identified in the first cycle to complete and proofread the
primer extension. In this way, an exonuclease-proficient polymerase is only present
in the reaction cell when the correct dNTP is present, so that exonucleolytic
degradation of correctly extended strands does not occur, while degradation and
correct re-extension of previously incorrectly extended strands does occur, thus
achieving extremely accurate strand extension.

The present invention also provides a method for monitoring reactive
sequencing reactions to detect and correct sequencing reaction errors resulting from
misincorporation, i.e., incorrectly incorporating a non-complementary base, and
extension failure, i.e., failure to extend a fraction of the DNA primer strands. The
method is based on the ability to (i) determine the size of the trailing strand
population (trailing strands are those primer strands which have undergone an
extension failure at any extension prior to the current reaction step); (ii) determine the
downstream sequence of the trailing strand population between the 3' terminus of the
trailing strands and the 3' terminus of the corresponding leading strands
("downstream" refers to the template sequence beyond the current 3' terminus of a
primer strand; correspondingly, "upstream" refers to the known template and
complementary primer sequence towards the 5' end of the primer strand; "leading
strands" are those primer strands which have not previously undergone extension
failure); and (iii) predict at each extension step the signal to be expected from the
extension of the trailing strands through simulation of the occurrence of an extension
failure at any point upstream from the 3' terminus of the leading strand. Subtraction
of the predicted signal from the measured signal yields a signal due only to valid
extension of the leading strand population.
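
The subtraction step can be sketched as follows. This is an illustrative model only, assuming that the signal is proportional to the number of bases incorporated, that signals are expressed per strand of the whole population, and that the trailing population is described by a hypothetical mapping from 3' terminus position to strand fraction. It ignores trailing strands that catch up with and rejoin the leading population. All function names are assumptions.

```python
def run_length(sequence: str, start: int, base: str) -> int:
    """Number of consecutive `base` residues in `sequence` beginning at `start`."""
    n = 0
    while start + n < len(sequence) and sequence[start + n] == base:
        n += 1
    return n


def predict_trailing_signal(read_so_far: str, trailing: dict[int, float], dntp: str) -> float:
    """Signal expected from trailing strands when `dntp` is supplied.

    `read_so_far` is the extension already read from the leading strands, and
    `trailing` maps the number of bases a trailing strand has incorporated to
    the fraction of all strands at that position.
    """
    return sum(fraction * run_length(read_so_far, pos, dntp)
               for pos, fraction in trailing.items())


def correct_signal(measured: float, read_so_far: str,
                   trailing: dict[int, float], dntp: str) -> float:
    """Measured signal minus the predicted trailing-strand contribution."""
    return measured - predict_trailing_signal(read_so_far, trailing, dntp)


def advance_trailing(read_so_far: str, trailing: dict[int, float], dntp: str) -> dict[int, float]:
    """Move each trailing population forward by the bases it incorporates."""
    moved: dict[int, float] = {}
    for pos, fraction in trailing.items():
        new_pos = pos + run_length(read_so_far, pos, dntp)
        moved[new_pos] = moved.get(new_pos, 0.0) + fraction
    return moved


if __name__ == "__main__":
    read = "GAAACC"        # bases already added to the leading strands
    trailing = {1: 0.1}    # 10% of strands stalled after the first base
    # Measured signal on a dATP addition: leading strands add one A (0.9 of the
    # strands) and trailing strands add three (0.1 x 3).
    print(correct_signal(1.2, read, trailing, "A"))
    print(advance_trailing(read, trailing, "A"))
```

Because the trailing strands extend through sequence that the leading strands have already traversed, their contribution at each step is predictable from data in hand.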

In a preferred embodiment of the invention, the monitoring for reactive
sequencing reaction errors is computer-aided. The ability to monitor extension
failures permits determination of the point to which the trailing strands for a given
template sequence have advanced and the sequence in the 1, 2 or 3 base gap between
these strands and the leading strands. Knowing this information, the dNTP probe
cycle can be altered to selectively extend the trailing strands for a given template
sequence while not extending the leading strands, thereby resynchronizing the
populations.

The present invention further provides an apparatus for DNA
sequencing comprising: (a) at least one chamber including a DNA primer/template
system which produces a detectable signal when a DNA polymerase enzyme
incorporates a deoxyribonucleotide monophosphate onto the 3' end of the primer
strand; (b) means for introducing into, and evacuating from, the reaction chamber at
least one selected from the group consisting of buffers, electrolytes, DNA template,
DNA primer, deoxyribonucleotides, and polymerase enzymes; (c) means for
amplifying said signal; and (d) means for converting said signal into an electrical
signal.

4. BRIEF DESCRIPTION OF THE DRAWINGS

Further objects and advantages of the invention will be apparent from a 
reading of the following description in conjunction with the accompanying drawings, 
in which: 

Figure 1 is a schematic diagram illustrating a reactive sequencing
device containing a thin film bismuth antimony thermopile in accordance with the
invention;

Figure 2 is a schematic diagram of a reactive sequencing device
containing a thermistor in accordance with the invention;

Figure 3 is a schematic diagram illustrating a representative
embodiment of microcalorimetry detection of a DNA polymerase reaction in
accordance with the invention;

Figure 4 is an electrophoretic gel showing a time course for primer
extension assays catalyzed by T4 DNA polymerase mutants;

Figure 5 is a schematic diagram illustrating a nucleotide attached to a
fluorophore by a benzoin ester which is a photocleavable linker for use in the
invention;

Figure 6 is a schematic illustration of a nucleotide attached to a
chemiluminescent tag for use in the invention;

Figure 7 is a schematic diagram of a nucleotide attached to a
chemiluminescent tag by a cleavable linkage;

Figures 8(a) and 8(b) are schematic diagrams of a mechanical
fluorescent sequencing method in accordance with the invention in which a DNA
template and primer are absorbed on beads captured behind a porous frit; and

Figure 9 is a schematic diagram of a sequencing method in accordance
with the invention utilizing a two cycle system.

Figure 10 is a diagram of the mechanism of photochemical
degradation of fluorescein by diphenyliodonium ion (DPI).

Figure 11 shows fluorescence spectra of equimolar concentrations of
fluorescein and tetramethylrhodamine dyes before and after addition of a solution of
diphenyliodonium chloride.

Figure 12 shows the UV absorption spectra obtained from (1) fluorescein
and (2) fluorescein + DPI after a single flash from a xenon camera strobe.

Figure 13 displays the fluorescence spectra from single nucleotide
polymerase reactions with DPI photobleaching between incorporation reactions.

Figure 14A-D. Simulation of Reactive Sequencing of [CTGA] GAA
ACC AGA AAG TCC [T], probed with a dNTP cycle. 14A. Sequence readout close
to the primer where no extension failure has occurred. 14B. Sequence readout
downstream of primer where 60% of the strands have undergone extension failure and
are producing out-of-phase signals and misincorporation has prevented extension on
75% of all strands. 14C. Downstream readout with error signals from trailing strands
(dark shading) distinguished from correct readout signals from leading strands (light
shading) using knowledge of the downstream sequence of the trailing strands. 14D.
Corrected sequence readout following subtraction of error signals from trailing
strands. Note the similarity to the data of Fig. 14A.

Figure 15. Effect of a leading strand population on extension signals. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention provides a method for determining the nucleic
acid sequence of a DNA molecule based on detection of successive single nucleotide
DNA polymerase mediated extension reactions. As described in detail below, in one
embodiment, a DNA primer/template system comprising a polynucleotide primer
complementary to and bound to a region of the DNA to be sequenced is constrained
within a reaction cell into which buffer solutions containing various reagents
necessary for a DNA polymerase reaction to occur are added. Into the reaction cell, a
single type of deoxynucleoside triphosphate (dNTP) is added. Depending on the
identity of the next complementary site in the DNA primer/template system, an
extension reaction will occur only when the appropriate nucleotide is present in the
reaction cell. A correlation between the nucleotide present in the reaction cell and
detection of an incorporation signal identifies the next nucleotide of the template.
Following each extension reaction, the reaction cell is flushed with dNTP-free buffer,
retaining the DNA primer/template system, and the cycle is repeated until the entire
nucleotide sequence is identified.

The present invention is based on the existence of a control signal
within the active site of DNA polymerases, which distinguish, with high fidelity,
complementary from non-complementary fits of incoming deoxynucleotide
triphosphates to the base on the template strand at the primer extension site, thereby
reading the sequence and incorporating at that site only the one type of deoxynucleotide
that is complementary. That is, if the available nucleotide type is not complementary
to the next template site, the polymerase is inactive; thus, the template sequence is the
DNA polymerase control signal. Therefore, by contacting a DNA polymerase system
with a single nucleotide type rather than all four, the next base in the sequence can be
identified by detecting whether or not a reaction occurs. Further, single base repeat
lengths can be quantified by quantifying the extent of reaction.
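
Quantifying repeat lengths from the extent of reaction can be sketched as a simple normalization. The example below is an assumption-laden illustration: it supposes that the signal scales linearly with the number of bases incorporated and that a reference amplitude for a single incorporation (`unit_signal`, a hypothetical calibration value) is available.

```python
def call_run_lengths(amplitudes: list[float], unit_signal: float) -> list[int]:
    """Convert incorporation signal amplitudes to integer repeat lengths.

    Each amplitude is divided by the signal of a single incorporation and
    rounded to the nearest whole number; values below half a unit count as
    no reaction.
    """
    if unit_signal <= 0:
        raise ValueError("unit_signal must be positive")
    calls = []
    for amplitude in amplitudes:
        # int(x + 0.5) rounds half up and avoids banker's rounding.
        calls.append(max(0, int(amplitude / unit_signal + 0.5)))
    return calls


if __name__ == "__main__":
    # Amplitudes from five successive dNTP additions, in arbitrary units.
    print(call_run_lengths([0.05, 1.1, 2.9, 0.0, 2.1], unit_signal=1.0))
```

In practice the linear range of the detector would bound the longest repeat that can be called reliably this way.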

As a first step in the practice of the inventive method, single-stranded
template DNA to be sequenced is prepared using any of a variety of different methods
known in the art. Two types of DNA can be used as templates in the sequencing
reactions. Pure single-stranded DNA such as that obtained from recombinant
bacteriophage can be used. The use of bacteriophage provides a method for
producing large quantities of pure single stranded template. Alternatively,
single-stranded DNA derived from double-stranded DNA that has been
denatured by heat or alkaline conditions, as described in Chen and Seeburg (1985,
DNA 4:165); Hattori and Sakaki (1986, Anal. Biochem. 152:232); and Mierendorf and
Pfeffer (1987, Methods Enzymol. 152:556), may be used. Such double stranded
DNA includes, for example, DNA samples derived from patients to be used in
diagnostic sequencing reactions.

The template DNA can be prepared by various techniques well known
to those of skill in the art. For example, template DNA can be prepared as vector
inserts using any conventional cloning methods, including those used frequently for
sequencing. Such methods can be found in Sambrook et al., Molecular Cloning: A
Laboratory Manual, Second Edition (Cold Spring Harbor Laboratories, New York,
1989). In a preferred embodiment of the invention, polymerase chain reactions (PCR)
may be used to amplify fragments of DNA to be used as template DNA as described
in Innis et al., ed. PCR Protocols (Academic Press, New York, 1990).

The amount of DNA template needed for accurate detection of the
polymerase reaction will depend on the detection technique used. For example, for
optical detection, e.g., fluorescence or chemiluminescence detection, relatively small
quantities of DNA in the femtomole range are needed. For thermal detection,
quantities approaching one picomole may be required to detect the change in
temperature resulting from a DNA polymerase mediated extension reaction.

In enzymatic sequencing reactions, the priming of DNA synthesis is
achieved by the use of an oligonucleotide primer with a base sequence that is
complementary to, and therefore capable of binding to, a specific region on the
template DNA sequence. In instances where the template DNA is obtained as single
stranded DNA from bacteriophage, or as double stranded DNA derived from
plasmids, "universal" primers that are complementary to sequences in the vectors,
i.e., the bacteriophage, cosmid and plasmid vectors, and that flank the template DNA,
can be used.

Primer oligonucleotides are chosen to form highly stable duplexes that
bind to the template DNA sequences and remain intact during any washing steps
during the extension cycles. Preferably, the primer oligonucleotide is
from 18-30 nucleotides in length and has a balanced base composition. The structure of
the primer should also be analyzed to confirm that it does not contain regions of dyad
symmetry which can fold and self anneal to form secondary structures, thereby
rendering the primers inefficient. Appropriate hybridization
conditions for binding of the oligonucleotide primers in the template systems will
depend on the primer sequence and are well known to those of skill in the art.
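
The primer criteria above lend themselves to a simple automated screen. The sketch below is illustrative only: the 18-30 nucleotide length range comes from the text, whereas the GC window used for "balanced" composition, the stem length, and the minimum loop size used to flag dyad symmetry are assumed thresholds, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_dyad_symmetry(primer: str, stem: int = 4, min_loop: int = 3) -> bool:
    """True if some `stem`-base segment can pair with a downstream segment.

    A segment and its reverse complement separated by at least `min_loop`
    bases could fold back into a hairpin.
    """
    for i in range(len(primer) - 2 * stem - min_loop + 1):
        arm = primer[i:i + stem]
        if reverse_complement(arm) in primer[i + stem + min_loop:]:
            return True
    return False


def primer_problems(primer: str, gc_range: tuple[float, float] = (0.4, 0.6)) -> list[str]:
    """List the reasons a candidate primer fails the screen (empty if none)."""
    primer = primer.upper()
    problems = []
    if not 18 <= len(primer) <= 30:
        problems.append("length outside 18-30 nucleotides")
    gc = (primer.count("G") + primer.count("C")) / len(primer) if primer else 0.0
    if not gc_range[0] <= gc <= gc_range[1]:
        problems.append("unbalanced base composition")
    if has_dyad_symmetry(primer):
        problems.append("possible hairpin (dyad symmetry)")
    return problems


if __name__ == "__main__":
    print(primer_problems("ACACACACACACACACACAC"))  # passes this simple screen
    print(primer_problems("GGGGAAAACCCC"))          # short, GC-rich, self-complementary
```

A screen of this kind only flags obvious problems; duplex stability under the wash conditions would still need to be assessed separately.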

In utilizing the reactive sequencing method of the invention, a variety
of different DNA polymerases may be used to incorporate dNTPs onto the 3' end of
the primer which is hybridized to the template DNA molecule. Such DNA
polymerases include but are not limited to Taq polymerase, T7 or T4 polymerase, and
Klenow polymerase. In a preferred embodiment of the invention, described in detail
below, DNA polymerases lacking 5'-3'-exonuclease proofreading activity are used in
the sequencing reactions. For the most rapid reaction kinetics, the amount of
polymerase is sufficient to ensure that each DNA molecule carries a non-covalently
attached polymerase molecule during reaction. For a typical equilibrium constant of
~50 nM for the dissociation equilibrium:

DNA-Pol ⇌ DNA + Pol    K ~ 50 nM

the desired condition is: [Pol] ≥ 50 nM + [DNA].
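
The effect of the polymerase concentration on DNA occupancy can be estimated from the equilibrium above. The following sketch assumes simple 1:1 binding with the ~50 nM dissociation constant quoted in the text and solves the mass-balance quadratic for the complex concentration; the function name and the example concentrations are illustrative assumptions.

```python
import math


def bound_dna_fraction(pol_total_nM: float, dna_total_nM: float, kd_nM: float = 50.0) -> float:
    """Fraction of DNA carrying a polymerase at equilibrium (1:1 binding).

    Solves (DNA_total - C)(Pol_total - C) = Kd * C for the complex
    concentration C and returns C / DNA_total.
    """
    if dna_total_nM <= 0 or pol_total_nM < 0 or kd_nM <= 0:
        raise ValueError("concentrations must be positive (polymerase may be zero)")
    s = pol_total_nM + dna_total_nM + kd_nM
    # The smaller root of C^2 - s*C + Pol_total*DNA_total = 0 is the physical one.
    complex_nM = (s - math.sqrt(s * s - 4.0 * pol_total_nM * dna_total_nM)) / 2.0
    return complex_nM / dna_total_nM


if __name__ == "__main__":
    # Total polymerase equal to Kd + [DNA]: a little over half of the DNA is occupied.
    print(round(bound_dna_fraction(100.0, 50.0), 3))
    # A large excess of polymerase drives occupancy towards saturation.
    print(round(bound_dna_fraction(1000.0, 50.0), 3))
```

Under this model the stated condition is a lower bound, since occupancy continues to rise as the polymerase concentration is increased beyond it.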

In addition, reverse transcriptase which catalyzes the synthesis of
single stranded DNA from an RNA template may be utilized in the reactive
sequencing method of the invention to sequence messenger RNA (mRNA). Such a
method comprises sequentially contacting an RNA template annealed to a primer
(RNA primer/template) with dNTPs in the presence of reverse transcriptase enzyme
to determine the sequence of the RNA. Because mRNA is produced by RNA
polymerase-catalyzed synthesis from a DNA template, and thus contains the sequence
information of the DNA template strand, sequencing the mRNA yields the sequence
of the DNA gene from which it was transcribed. Eukaryotic mRNAs have poly(A)
tails and therefore the primer for reverse transcription can be an oligo(dT). Typically,
it will be most convenient to synthesize the oligo(dT) primer with a terminal biotin or
amino group through which the primer can be captured on a substrate and
subsequently hybridize to and capture the template mRNA strand.

The extension reactions are carried out in buffer solutions which
contain the appropriate concentrations of salts, dNTPs and DNA polymerase required
for the DNA polymerase mediated extension to proceed. For guidance regarding such
conditions see, for example, Sambrook et al. (1989, Molecular Cloning, A
Laboratory Manual, Cold Spring Harbor Press, N.Y.); and Ausubel et al. (1989,
Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley
Interscience, N.Y.).

Typically, buffer containing one of the four dNTPs is added into a
reaction cell. Depending on the identity of the nucleoside base at the next unpaired
template site in the primer/template system, a reaction will occur when the reaction
cell contains the appropriate dNTP. When the reaction cell contains any one of the
other three incorrect dNTPs, no reaction will take place.

The reaction cell is then flushed with dNTP free buffer and the cycle is
repeated until a complete DNA sequence is identified. Detection of a DNA
polymerase mediated extension can be made using any of the detection methods
described in detail below including optical and thermal detection of an extension
reaction.

In some instances, a nucleotide solution is found to be contaminated
with any of the other three nucleotides. In such instances a small fraction of strands
may be extended by incorporation of an impurity dNTP when the dNTP type supplied
is incorrect for extension, producing a population of strands which are subsequently
extended ahead of the main strand population. Thus, in an embodiment of the
invention, each nucleotide solution can be treated to remove any contaminating
nucleotides. Treatment of each nucleotide solution involves reaction of the solution
prior to use with immobilized DNA complementary to each of the possibly
contaminating nucleotides. For example, a dATP solution will be allowed to react
with immobilized poly (dA), poly (dG) or poly (dC), with appropriate primers and
polymerase, for a time sufficient to incorporate any contaminating dTTP, dCTP and
dGTP nucleotides into DNA.

In a preferred embodiment of the invention, the primer/template
system comprises the template DNA tethered to a solid phase support to permit the
sequential addition of sequencing reaction reagents without complicated and
time-consuming purification steps following each extension reaction. Preferably, the
template DNA is covalently attached to a solid phase support, such as the surface of a
reaction flow cell, a polymeric microsphere, filter material, or the like, which permits
the sequential application of sequencing reaction reagents, buffers, dNTPs and
DNA polymerase, without complicated and time-consuming purification steps
following each extension reaction. Alternatively, for applications that require
sequencing of many samples containing the same vector template or same gene, for
example, in diagnostic applications, a universal primer may be tethered to a support,
and the template DNA allowed to hybridize to the immobilized primer.

The DNA may be modified to facilitate covalent or non-covalent
tethering of the DNA to a solid phase support. For example, when PCR is used to
amplify DNA fragments, the 5' ends of one set of PCR primer oligonucleotide
strands may be modified to carry a linker moiety for tethering one of the two
complementary types of DNA strands produced to a solid phase support. Such linker
moieties include, for example, biotin. When using biotin, the biotinylated DNA
fragments may be bound non-covalently to streptavidin covalently attached to the
solid phase support. Alternatively, an amino group (-NH2) may be chemically
incorporated into one of the PCR primer strands and used to covalently link the DNA
template to a solid phase support using standard chemistry, such as reactions with
N-hydroxysuccinimide activated agarose surfaces.

In another embodiment, the 5' ends of the sequencing oligonucleotide
primer may be modified with biotin, for non-covalent capture to a streptavidin-treated
support, or with an amino group for chemical linkage to a solid support; the template
strands are then captured by the non-covalent binding attraction between the
immobilized primer base sequence and the complementary sequence on the template
strands. Methods for immobilizing DNA on a solid phase support are well known to
those of skill in the art and will vary depending on the solid phase support chosen.

In the reactive sequencing method of the present invention, DNA
polymerase is presented sequentially with each of the 4 dNTPs. In the majority of
the reaction cycles, only incorrect dNTPs will be present, thereby increasing the
likelihood of misincorporation of incorrect nucleotides into the extending DNA
primer/template system.

Accordingly, the present invention further provides methods for
optimizing the reactive sequencing reaction to achieve rapid and complete
incorporation of the correct nucleotide into the DNA primer/template system, while
limiting the misincorporation of incorrect nucleotides. For example, dNTP
concentrations may be lowered to reduce misincorporation of incorrect nucleotides
into the DNA primer. Km values for incorrect dNTPs can be as much as 1000-fold
higher than for correct nucleotides, indicating that a reduction in dNTP concentrations
can reduce the rate of misincorporation of nucleotides. Thus, in a preferred
embodiment of the invention the concentration of dNTPs in the sequencing reactions
is approximately 5-20 µM. At this concentration, incorporation rates are as close
to the maximum rate of 400 nucleotides/s for T4 DNA polymerase as possible.
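
By way of illustration only, the effect of dNTP concentration on misincorporation may be sketched with a simple Michaelis-Menten rate law. The sketch assumes that correct and incorrect dNTPs share the same maximum rate and differ only in Km, which is a simplification. The Km of 1 µM for the correct dNTP is a hypothetical value chosen for the example; only the 1000-fold Km ratio and the maximum rate of 400 nucleotides/s are taken from the text.

```python
MAX_RATE_PER_S = 400.0                       # maximum rate, from the text
KM_CORRECT_UM = 1.0                          # hypothetical Km for the correct dNTP
KM_INCORRECT_UM = 1000.0 * KM_CORRECT_UM     # up to 1000-fold higher, from the text


def incorporation_rate(conc_um, km_um, max_rate=MAX_RATE_PER_S):
    """Michaelis-Menten rate (nucleotides/s) at a dNTP concentration in µM."""
    return max_rate * conc_um / (km_um + conc_um)


def misincorporation_ratio(conc_um):
    """Rate for an incorrect dNTP relative to the correct dNTP."""
    return (incorporation_rate(conc_um, KM_INCORRECT_UM)
            / incorporation_rate(conc_um, KM_CORRECT_UM))


for conc in (5.0, 20.0, 200.0, 2000.0):
    correct = incorporation_rate(conc, KM_CORRECT_UM)
    print(f"{conc:7.1f} µM  correct rate {correct:6.1f}/s  "
          f"incorrect/correct {misincorporation_ratio(conc):.4f}")
```

Under these assumptions the correct dNTP is already incorporated at close to the maximum rate in the 5-20 µM range, while the relative rate for an incorrect dNTP rises steadily as the concentration is increased.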

In addition, relatively short reaction times can be used to reduce the
probability of misincorporation. For an incorporation rate approaching the maximum
rate of ~400 nucleotides/s, a reaction time of approximately 25 milliseconds (ms) will
be sufficient to ensure extension of 99.99% of primer strands.
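
By way of illustration only, the relation between reaction time and the fraction of primer strands extended may be checked numerically. The sketch assumes simple first-order kinetics in which each strand is extended with a rate constant equal to the incorporation rate; this kinetic model and the function names are assumptions made for the example and are not stated in this specification.

```python
import math


def fraction_extended(rate_per_s, time_s):
    """Fraction of strands extended after time_s under first-order kinetics."""
    return 1.0 - math.exp(-rate_per_s * time_s)


def time_for_fraction(rate_per_s, fraction):
    """Reaction time (s) needed to extend the given fraction of strands."""
    return -math.log(1.0 - fraction) / rate_per_s


print(f"extended after 25 ms: {fraction_extended(400.0, 0.025):.5f}")
print(f"time for 99.99%: {time_for_fraction(400.0, 0.9999) * 1e3:.1f} ms")
```

Under this assumed model a rate of 400 nucleotides/s reaches 99.99% extension slightly before 25 ms, which agrees with the approximate reaction time given above.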

In a specific embodiment of the invention, DNA polymerases lacking
3' to 5' exonuclease activity may be used for reactive sequencing to limit
exonucleolytic degradation of primers that would occur in the absence of correct
dNTPs. In the presence of all four dNTPs, misincorporation frequencies by DNA
polymerases possessing exonucleolytic proofreading activity are as low as one error
in 10^6 to 10^8 nucleotides incorporated as discussed in Echols and Goodman (1991,
Annu. Rev. Biochem. 60:477-511); Goodman et al. (1993, Crit. Rev. Biochem.
Molec. Biol. 28:83-126); and Loeb and Kunkel (1982, Annu. Rev. Biochem.
51:429-457). In the absence of proofreading, DNA polymerase error rates are
typically on the order of 1 in 10^4 to 1 in 10^6. Although exonuclease activity increases
the fidelity of a DNA polymerase, the use of DNA polymerases having proofreading
activity can pose technical difficulties for the reactive sequencing method of the
present invention. Not only will the exonuclease remove any misincorporated
nucleotides, but also, in the absence of a correct dNTP complementary to the next
template base, the exonuclease will remove correctly-paired nucleotides successively
until a point on the template sequence is reached where the base is complementary to
the dNTP in the reaction cell. At this point, an idling reaction is established where the
polymerase repeatedly incorporates the correct dNMP and then removes it. Only
when a correct dNTP is present will the rate of polymerase activity exceed the
exonuclease rate so that an idling reaction is established that maintains the
incorporation of that correct nucleotide at the 3' end of the primer.

A number of T4 DNA polymerase mutants containing specific amino
acid substitutions possess reduced exonuclease activity levels up to 10,000-fold less
than the wild-type enzyme. For example, Reha-Krantz and Nonay (1993, J. Biol.
Chem. 268:27100-27108) report that when Asp 112 was replaced with Ala and Glu
114 was replaced with Ala (D112A/E114A) in T4 polymerase, these two amino acid
substitutions reduced the exonuclease activity on double stranded DNA by a factor of
about 300 relative to the wild type enzyme. Such mutants may be advantageously
used in the practice of the invention for incorporation of nucleotides into the DNA
primer/template system.

In yet another embodiment of the invention, DNA polymerases which
are more accurate than wild type polymerases at incorporating the correct nucleotide
into a DNA primer/template may be used. For example, in a (D112A/E114A) mutant
T4 polymerase with a third mutation where Ile 417 is replaced by Val
(I417V/D112A/E114A), the I417V mutation results in an antimutator phenotype for
the polymerase (Reha-Krantz and Nonay, 1994, J. Biol. Chem. 269:5635-5643; Stocki
et al., 1995, J. Mol. Biol. 254:15-28). This antimutator phenotype arises because the
polymerase tends to move the primer ends from the polymerase site to the
exonuclease site more frequently, and thus to proofread more frequently, than the wild
type polymerase, which increases the accuracy of synthesis.

In yet another embodiment of the invention, polymerase mutants that
are capable of more efficiently incorporating fluorescent-labeled nucleotides into the
template DNA system molecule may be used in the practice of the invention. The
efficiency of incorporation of fluorescent-labeled nucleotides may be reduced due to
the presence of bulky fluorophore labels that may inhibit dNTP interaction at the
active site of the polymerase. Polymerase mutants that may be advantageously used
for incorporation of fluorescent-labeled dNTPs into DNA include but are not limited
to those described in U.S. Application Serial No. 08/632,742 filed April 16, 1996
which is incorporated by reference herein.

In a preferred embodiment of the invention, the reactive sequencing
method utilizes a two cycle system. An exonuclease-deficient polymerase is used in
the first cycle and a mixture of exonuclease-deficient and exonuclease-proficient
enzymes is used in the second cycle. In the first cycle, the primer/template system
together with an exonuclease-deficient polymerase will be presented sequentially with
each of the four possible nucleotides. Reaction time and conditions will be such that a
sufficient fraction of primers (~98% for accurate quantification of multiple single-base
repeats) is extended to allow for detection and quantification of nucleotide
incorporation. In the second cycle, after identification of the correct nucleotide, a
mixture of exonuclease proficient and deficient polymerases, or a polymerase
containing both types of activity, will be added together with the correct dNTP
identified in the first cycle to complete and proofread the primer extension. In this
way, an exonuclease-proficient polymerase is only present in the reaction cell when
the correct dNTP is present, so that exonucleolytic degradation of correctly extended
strands does not occur, while degradation and correct re-extension of previously
incorrectly extended strands does occur, thus achieving extremely accurate strand
extension.

The detection of a DNA polymerase mediated extension reaction can
be accomplished in a number of ways. For example, the heat generated by the
extension reaction can be measured using a variety of different techniques such as
those employing thermopile, thermistor and refractive index measurements.

In an embodiment of the invention, the heat generated by a DNA
polymerase mediated extension reaction can be measured. For example, in a reaction
cell volume of (100 micrometers)^3 containing 1 µg of water as the sole thermal mass
and 2 x 10^11 DNA template molecules (300 fmol) tethered within the cell, the
temperature of the water increases by 1 x 10^-3 °C for a polymerase reaction which
extends the primer by a single nucleoside monophosphate. This calculation is based
on the experimental determination that a one base pair extension in a DNA chain is an
exothermic reaction and the enthalpy change associated with this reaction is 3.5
kcal/mole of base. Thus extension of 300 fmol of primer strands by a single base
produces 300 fmol x 3.5 kcal/mol or 1 x 10^-9 cal of heat. This is sufficient to raise
the temperature of 1 µg of water by 1 x 10^-3 °C. Such a temperature change is
readily detectable using thermistors (sensitivity ≤ 10^-4 °C); thermopiles (sensitivity
≤ 10^-5 °C); and refractive index measurements (sensitivity ≤ 10^-6 °C).
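
By way of illustration only, the heat and temperature figures above may be reproduced with a short calculation. The amount of primer (300 fmol), the enthalpy (3.5 kcal/mol) and the thermal mass (1 µg of water) are taken from the text; the specific heat of water (1 cal per gram per °C) is the standard value, and the function name is hypothetical.

```python
ENTHALPY_KCAL_PER_MOL = 3.5      # heat released per base incorporated (from the text)
WATER_SPECIFIC_HEAT = 1.0        # cal per gram per deg C (standard value for water)


def extension_heat_and_rise(primer_mol, water_mass_g, bases_per_strand=1):
    """Return (heat in cal, temperature rise in deg C) for one extension step."""
    heat_cal = primer_mol * bases_per_strand * ENTHALPY_KCAL_PER_MOL * 1000.0
    delta_t_c = heat_cal / (water_mass_g * WATER_SPECIFIC_HEAT)
    return heat_cal, delta_t_c


heat, delta_t = extension_heat_and_rise(primer_mol=300e-15, water_mass_g=1e-6)
print(f"heat released: {heat:.2e} cal")
print(f"temperature rise: {delta_t:.2e} deg C")
```

The `bases_per_strand` argument scales the result for a single-base repeat, in which several identical nucleotides are incorporated in one cycle.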

In a specific embodiment of the invention, thermopiles may be used to
detect temperature changes. Such thermopiles are known to have a high sensitivity to
temperature and can make measurements in the tens of micro-degree range with time
constants of several seconds. Thermopiles may be fabricated by constructing serial sets of
junctions of two dissimilar metals and physically arranging the junctions so that
alternating junctions are separated in space. One set of junctions is maintained at a
constant reference temperature, while the alternate set of junctions is located in the
region whose temperature is to be sensed. A temperature difference between the two
sets of junctions produces a potential difference across the junction set which is
proportional to the temperature difference, to the thermoelectric coefficient of the
junction and to the number of junctions. For optimum response, bimetallic pairs with
a large thermoelectric coefficient are desirable, such as bismuth and antimony.
Thermopiles may be fabricated using thin film deposition techniques in which
evaporated metal vapor is deposited onto insulating substrates through specially
fabricated masks. Thermopiles that may be used in the practice of the invention
include thermopiles such as those described in U.S. Patent 4,935,345, which is
incorporated by reference herein.

In a specific embodiment of the invention, miniature thin film
thermopiles produced by metal evaporation techniques, such as those described in
U.S. Patent 4,935,345 incorporated herein by reference, may be used to detect the
enthalpy changes. Such devices have been made by vacuum evaporation through
masks of about 10 mm square. Using methods of photolithography, sputter etching
and reverse lift-off techniques, devices as small as 2 mm square may be constructed
without the aid of modern microlithographic techniques. These devices contain 150
thermoelectric junctions and employ 12 micron line widths and can measure the
exothermic heat of reaction of enzyme-catalyzed reactions in flow streams where the
enzyme is preferably immobilized on the surface of the thermopile.

To incorporate thermopile detection technology into a reactive
sequencing device, thin-film bismuth-antimony thermopiles 2, as shown in Figure 1,
may be fabricated by successive electron-beam evaporation of bismuth and antimony
metals through two different photolithographically-generated masks in order to
produce a zigzag array of alternating thin bismuth and antimony wires which are
connected to form two sets of bismuth-antimony thermocouple junctions. Modern
microlithographic techniques will allow fabrication of devices at least one order of
magnitude smaller than those previously made, i.e., with line widths as small as 1 µm
and overall dimensions on the order of 100 µm^2. One set of junctions 4 (the sensor
junctions) is located within the reaction cell 6, i.e., deposited on a wall of the reaction
cell, while the second reference set of junctions 8 is located outside the cell at a
reference point whose temperature is kept constant. Any difference in temperature
between the sensor junctions and the reference junctions results in an electric potential
being generated across the device, which can be measured by a high-resolution digital
voltmeter 10 connected to measurement points 12 at either end of the device. It is not
necessary that the temperature of the reaction cell and the reference junctions be the
same in the absence of a polymerase reaction event, only that a change in the
temperature of the sensor junctions due to a polymerase reaction event be detectable
as a change in the voltage generated across the thermopile.
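
By way of illustration only, the thermopile output may be estimated from the proportionality described for thermopiles in this specification (voltage proportional to the temperature difference, the thermoelectric coefficient and the number of junctions). The thermoelectric coefficient of 100 µV per °C per junction is an assumed order-of-magnitude value for a bismuth-antimony couple and is not taken from this specification. The count of 150 junctions is borrowed from the earlier devices mentioned here, and treating all of them as sensing junctions is also an assumption.

```python
SEEBECK_V_PER_C = 100e-6     # assumed thermoelectric coefficient per junction (V per deg C)
N_JUNCTIONS = 150            # junction count of the earlier devices, treated as sensing junctions


def thermopile_voltage(delta_t_c, n_junctions=N_JUNCTIONS, seebeck=SEEBECK_V_PER_C):
    """Output voltage (V) for a temperature difference between the junction sets."""
    return n_junctions * seebeck * delta_t_c


# Temperature rise of about 1e-3 deg C for a single-base extension.
signal_v = thermopile_voltage(1e-3)
print(f"estimated thermopile signal: {signal_v * 1e6:.1f} microvolts")
```

Because the output scales linearly with the number of junctions, a smaller device with fewer junctions would give a proportionally smaller signal for the same temperature change under these assumptions.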

In addition to thermopiles, as shown in Figure 2, a thermistor 14 may
also be used to detect temperature changes in the reaction cell 6 resulting from DNA
polymerase mediated incorporation of dNMPs into the DNA primer strand.
Thermistors are semiconductors composed of a sintered mixture of metallic oxides
such as manganese, nickel, and cobalt oxides. This material has a large temperature
coefficient of resistance, typically ~4% per °C, and so can sense extremely small
temperature changes when the resistance is monitored with a stable, high-resolution
resistance-measuring device such as a digital voltmeter, e.g., Keithley Instruments
Model 2002. A thermistor 14, such as that depicted in Figure 2, may be fabricated in
the reactive sequencing reaction cell by sputter depositing a thin film of the active
thermistor material onto the surface of the reaction cell from a single target consisting
of hot pressed nickel, cobalt and manganese oxides. Metal interconnections 16 which
extend out beyond the wall of the reaction cell may also be fabricated in a separate
step so that the resistance of the thermistor may be measured using an external
measuring device 18.
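
By way of illustration only, the resistance change that the external measuring device must resolve may be estimated from the temperature coefficient given above. The sketch uses a linear small-signal approximation and only the magnitude of the ~4% per °C coefficient. The 10 kilo-ohm nominal resistance is a hypothetical value chosen for the example.

```python
TEMP_COEFF_PER_C = 0.04      # magnitude of the coefficient, ~4% per deg C (from the text)


def fractional_resistance_change(delta_t_c, coeff_per_c=TEMP_COEFF_PER_C):
    """Fractional resistance change for a small temperature change."""
    return coeff_per_c * delta_t_c


def resistance_change_ohm(r0_ohm, delta_t_c):
    """Absolute resistance change for a thermistor of nominal resistance r0_ohm."""
    return r0_ohm * fractional_resistance_change(delta_t_c)


# Temperature rise of about 1e-3 deg C for a single-base extension.
frac = fractional_resistance_change(1e-3)
print(f"fractional change: {frac:.1e}")
print(f"change for a 10 kilo-ohm device: {resistance_change_ohm(10_000.0, 1e-3):.2f} ohm")
```

Under these assumptions the measuring device needs to resolve a few parts in 10^5 of the thermistor resistance to register a single-base extension.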

Temperature changes may also be sensed using a refractive index
measurement technique. For example, techniques such as those described in Bornhop
(1995, Applied Optics 34:3234-323) and U.S. Patent 5,325,170, may be used to detect
refractive index changes for liquids in capillaries. In such a technique, a low-power
He-Ne laser is aimed off-center at a right angle to a capillary and undergoes multiple
internal reflection. Part of the beam travels through the liquid while the remainder
reflects only off the external capillary wall. The two beams undergo different phase
shifts depending on the refractive index difference between the liquid and capillary.
The result is an interference pattern, with the fringe position extremely sensitive to
temperature-induced refractive index changes.

In a further embodiment of the invention, the thermal response of the
system may be increased by the presence of inorganic pyrophosphatase enzyme which
is contacted with the template system along with the dNTP solution. Additionally,
heat is released as the pyrophosphate released from the dNTPs upon incorporation
into the template system is hydrolyzed by inorganic pyrophosphatase enzyme.

In another embodiment, the pyrophosphate released upon incorporation
of dNTPs may be removed from the template system and hydrolyzed, and the
resultant heat detected, using thermopile, thermistor or refractive index methods, in a
separate reaction cell downstream. In this reaction cell, inorganic pyrophosphatase
enzyme may be mixed in solution with the dNTP removed from the DNA template
system, or alternatively the inorganic pyrophosphatase enzyme may be covalently
tethered to the wall of the reaction cell.

Alternatively, the polymerase-catalyzed incorporation of a nucleotide
base can be detected using fluorescence and chemiluminescence detection schemes.
The DNA polymerase mediated extension is detected when a fluorescent or
chemiluminescent signal is generated upon incorporation of a fluorescently or
chemiluminescently labeled dNMP into the extending DNA primer strand. Such tags
are attached to the nucleotide in such a way as to not interfere with the action of the
polymerase. For example, the tag may be attached to the nucleotide base by a linker
arm sufficiently long to move the bulky fluorophore away from the active site of the
enzyme.

For use of such detection schemes, nucleotide bases are labeled by
covalently attaching a compound such that a fluorescent or chemiluminescent signal
is generated following incorporation of a dNTP into the extending DNA
primer/template. Examples of fluorescent compounds for labeling dNTPs include but
are not limited to fluorescein, rhodamine, and BODIPY
(4,4-difluoro-4-bora-3a,4a-diaza-s-indacene). See Handbook of Molecular Probes
and Fluorescent Chemicals available from Molecular Probes, Inc. (Eugene, OR).
Examples of chemiluminescence based compounds that may be used in the
sequencing methods of the invention include but are not limited to luminol and
dioxetanones (see Gundermann and McCapra, "Chemiluminescence in Organic
Chemistry", Springer-Verlag, Berlin Heidelberg, 1987).

Fluorescently or chemiluminescently labeled dNTPs are added
individually to a DNA template system containing template DNA annealed to the
primer, DNA polymerase and the appropriate buffer conditions. After the reaction
interval, the excess dNTP is removed and the system is probed to detect whether a
fluorescent or chemiluminescent tagged nucleotide has been incorporated into the
DNA template. Detection of the incorporated nucleotide can be accomplished using
different methods that will depend on the type of tag utilized.

For fluorescently-tagged dNTPs the DNA template system may be
illuminated with optical radiation at a wavelength which is strongly absorbed by the
tag entity. Fluorescence from the tag is detected using, for example, a photodetector
together with an optical filter which excludes any scattered light at the excitation
wavelength.

Since labels on previously incorporated nucleotides would interfere
with the signal generated by the most recently incorporated nucleotide, it is essential
that the fluorescent tag be removed at the completion of each extension reaction. To
facilitate removal of a fluorescent tag, the tag may be attached to the nucleotide via a
chemically or photochemically cleavable linker using methods such as those
described by Metzker, M.L. et al. (1994, Nucleic Acids Research 22:4259-4267) and
Burgess, K. et al. (1997, J. Org. Chem. 62:5165-5168) so that the fluorescent tag
may be removed from the DNA template system before a new extension reaction is
carried out.

In a further embodiment utilizing fluorescent detection, the fluorescent
tag is attached to the dNTP by a photocleavable or chemically cleavable linker, and
the tag is detached following the extension reaction and removed from the template
system into a detection cell where the presence, and the amount, of the tag is
determined by optical excitation at a suitable wavelength and detection of
fluorescence. In this embodiment, the possibility of fluorescence quenching, due to
the presence of multiple fluorescent tags immediately adjacent to one another on a
primer strand which has been extended complementary to a single base repeat region
in the template, is minimized, and the accuracy with which the repeat number can be
determined is optimized. In addition, excitation of fluorescence in a separate chamber
minimizes the possibility of photolytic damage to the DNA primer/template system.

In an additional embodiment utilizing fluorescent detection, the signal
from the fluorescent tag can be destroyed using a chemical reaction which specifically
targets the fluorescent moiety and reacts to form a final product which is no longer
fluorescent. In this embodiment, the fluorescent tag attached to the nucleotide base is
destroyed following extension and detection of the fluorescence signal, without the
removal of the tag. In a specific embodiment, fluorophores attached to dNTP bases
may be selectively destroyed by reaction with compounds capable of extracting an
electron from the excited state of the fluorescent moiety thereby producing a radical
ion of the fluorescent moiety which then reacts to form a final product which is no
longer fluorescent. In a further specific embodiment, the signal from a fluorescent tag
is destroyed by photochemical reaction with the cation of a diphenyliodonium salt
following extension and detection of the fluorescence label. The fluorescent tag
attached to the incorporated nucleotide base is destroyed, without removal of the tag,
by the addition of a solution of a diphenyliodonium salt to the reaction cell and
subsequent UV light exposure. The diphenyliodonium salt solution is removed and
the reactive sequencing is continued. This embodiment does not require dNTPs with
chemically or photochemically cleavable linkers, since the fluorescent tag need not be
removed.

In a further embodiment of the technique, the response generated by a
DNA polymerase-mediated extension reaction can be amplified. In this embodiment,
the dNTP is chemically modified by the covalent attachment of a signaling tag
through a linker that can be cleaved either chemically or photolytically. Following
exposure of the dNTP to the primer/template system and flushing away any
unincorporated chemically modified dNTP, any signaling tag that has been
incorporated is detached by a chemical or photolytic reaction and flushed out of the
reaction chamber to an amplification chamber in which an amplified signal may be
produced and detected.

A variety of methods may be used to produce an amplified signal. In
one such method the signaling tag has a catalytic function. When the catalytic tag is
cleaved and allowed to react with its substrate, many cycles of chemical reaction
ensue producing many moles of product per mole of catalytic tag, with a
corresponding multiplication of reaction enthalpy. Either the reaction product is
detected, through some property such as color or absorbance, or the amplified heat
product is detected by a thermal sensor. For example, an enzyme may be covalently
attached to the dNTP via a cleavable linker arm of sufficient length that the enzyme
does not interfere with the active site of the polymerase enzyme. Following
incorporation onto the DNA primer strand, that enzyme is detached and transported to
a second reactor volume in which it is allowed to interact with its specific substrate;
thus an amplified response is obtained as each enzyme molecule carries out many
cycles of reaction. For example, the enzyme catalase (CAT) catalyzes the reaction:

H2O2 --(CAT)--> H2O + 1/2 O2 + ~100 kJ/mol heat

If each dNTP is tagged with a catalase molecule which is detached after dNMP
incorporation and allowed to react downstream with hydrogen peroxide, each
nucleotide incorporation would generate ~25 kcal/mol x N of heat where N is the
number of hydrogen peroxide molecules decomposed by the catalase. The heat of
decomposition of hydrogen peroxide is already ~6-8 times greater than for nucleotide
incorporation (i.e. 3.5 - 4 kcal/mol). For decomposition of ~100 - 150 hydrogen
peroxide molecules the amount of heat generated per base incorporation approaches
1000 times that of the unamplified reaction. Similarly, enzymes which produce
colored products, such as those commonly used in enzyme-linked immunosorbent
assays (ELISA) could be incorporated as detachable tags. For example the enzyme
alkaline phosphatase converts colorless p-nitrophenyl phosphate to a colored product
(p-nitrophenol); the enzyme horseradish peroxidase converts colorless
o-phenylenediamine hydrochloride to an orange product. Chemistries for linking
these enzymes to proteins such as antibodies are well-known to those versed in the
art, and could be adapted to link the enzymes to nucleotide bases via linker arms that
maintain the enzymes at a distance from the active site of the polymerase enzymes.
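
By way of illustration only, the thermal amplification obtained with a catalase tag may be checked with a short calculation. The heats of reaction (~25 kcal/mol for hydrogen peroxide decomposition and 3.5 kcal/mol for nucleotide incorporation) are taken from the text; the function name is hypothetical, and the sketch assumes that all of the heat from each turnover reaches the sensor.

```python
H2O2_HEAT_KCAL_PER_MOL = 25.0            # heat of H2O2 decomposition (from the text)
INCORPORATION_HEAT_KCAL_PER_MOL = 3.5    # heat of nucleotide incorporation (from the text)


def amplification_factor(turnovers):
    """Heat per incorporated base with a catalase tag, relative to the
    unamplified incorporation reaction, for N peroxide molecules decomposed."""
    return H2O2_HEAT_KCAL_PER_MOL * turnovers / INCORPORATION_HEAT_KCAL_PER_MOL


for n in (1, 100, 150):
    print(f"N = {n:3d}: about {amplification_factor(n):.0f}-fold")
```

A single turnover gives roughly a 7-fold gain, and 100 to 150 turnovers give a gain of several hundred to about one thousand, consistent with the figures stated above.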

In a further embodiment, an amplified thermal signal may be produced
when the signaling tag is an entity which can stimulate an active response in cells
which are attached to, or held in the vicinity of, a thermal sensor such as a thermopile
or thermistor. Pizziconi and Page (1997, Biosensors and Bioelectronics 12:457-466)
reported that harvested and cultured mast cell populations could be activated by
calcium ionophore to undergo exocytosis to release histamine, up to 10 - 30 pg (100 -
300 fmol) per cell. The multiple cell reactions leading to exocytosis are themselves
exothermic. This process is further amplified using the enzymes diamine oxidase to
oxidize the histamine to hydrogen peroxide and imidazoleacetaldehyde, and catalase
to disproportionate the hydrogen peroxide. The two reactions together liberate over 100
kJ of heat per mole of histamine. For example, a calcium ionophore is covalently
attached to the dNTP base via a linker arm which distances the linked calcium
ionophore from the active site of the polymerase enzyme and is chemically or
photochemically cleavable. Following the DNA polymerase catalyzed incorporation
step, and flushing away unincorporated nucleotides, any calcium ionophore remaining
bound to an incorporated nucleotide may be cleaved and flushed downstream to a
detection chamber containing a mast cell-based sensor such as described by Pizziconi
and Page (1997, Biosensors and Bioelectronics 12:457-466). The calcium ionophore
would bind to receptors on the mast cells stimulating histamine release with the
accompanying generation of heat. The heat production could be further amplified by
introducing the enzymes diamine oxidase to oxidize the histamine to hydrogen
peroxide and imidazoleacetaldehyde, and catalase to disproportionate the hydrogen
peroxide. Thus a significantly amplified heat signal would be produced which could
readily be detected by a thermopile or thermistor sensor within, or in contact with, the
reaction chamber.

In a further embodiment utilizing chemiluminescent detection, the
chemiluminescent tag is attached to the dNTP by a photocleavable or chemically
cleavable linker. The tag is detached following the extension reaction and removed
from the template system into a detection cell where the presence, and the amount, of
the tag is determined by an appropriate chemical reaction and sensitive optical
detection of the light produced. In this embodiment, the possibility of a non-linear
optical response due to the presence of multiple chemiluminescent tags immediately
adjacent to one another on a primer strand which has been extended complementary
to a single base repeat region in the template, is minimized, and the accuracy with
which the repeat number can be determined is optimized. In addition, generation of
chemiluminescence in a separate chamber minimizes chemical damage to the DNA
primer/template system, and allows detection under harsh chemical conditions which
otherwise would chemically damage the DNA primer/template. In this way,
chemiluminescent tags can be chosen to optimize chemiluminescence reaction speed,
or compatibility of the tagged dNTP with the polymerase enzyme, without regard to
the compatibility of the chemiluminescence reaction conditions with the DNA
primer/template.

In a further embodiment of the invention, the concentration of the
dNTP solution removed from the template system following each extension reaction
can be measured by detecting a change in UV absorption due to a change in the
concentration of dNTPs, or a change in fluorescence response of fluorescently-tagged
dNTPs. The incorporation of nucleotides into the extended template would result in
a decreased concentration of nucleotides removed from the template system. Such a
change could be detected by measuring the UV absorption of the buffer removed from
the template system following each extension cycle.

In a further embodiment of the invention, extension of the primer
strand may be sensed by a device capable of sensing fluorescence from, or resolving
an image of, a single DNA molecule. Devices capable of sensing fluorescence from
a single molecule include the confocal microscope and the near-field optical
microscope. Devices capable of resolving an image of a single molecule include the
scanning tunneling microscope (STM) and the atomic force microscope (AFM).

In this embodiment of the invention, a single DNA template molecule
with attached primer is immobilized on a surface and viewed with an optical
microscope or an STM or AFM before and after exposure to buffer solution
containing a single type of dNTP, together with polymerase enzyme and other
necessary electrolytes. When an optical microscope is used, the single molecule is
exposed serially to fluorescently-tagged dNTP solutions and as before incorporation is
sensed by detecting the fluorescent tag after excess unreacted dNTP is removed.
Again, as before, the incorporated fluorescent tag must be cleaved and discarded
before a subsequent tag can be detected. Using the STM or AFM, the change in
length of the primer strand is imaged to detect incorporation of the dNTP.
Alternatively the dNTP may be tagged with a physically bulky molecule, more readily
visible in the STM or AFM, and this bulky tag is removed and discarded before each
fresh incorporation reaction.

When sequencing a single molecular template in this way, the
possibility of incomplete reaction producing erroneous signal and out-of-phase strand
extension does not exist, and the consequent limitations on read length do not apply.
For a single molecular template, reaction either occurs or it does not, and if it does
not, then extension either ceases and is known to cease, or correct extension occurs in
a subsequent cycle with the correct dNTP. In the event that an incorrect nucleotide is
incorporated, which has the same probability as in the multiple strand processes
discussed earlier, for example 1 in 1,000, an error is recorded in the sequence, but this
error does not propagate or affect subsequent readout and so the read length is not
limited by incorrect incorporation.

5.1. DETECTION AND COMPENSATION FOR DNA POLYMERASE
ERRORS

In the reactive sequencing process, extension failures will typically
arise due to the kinetics of the extension reaction and limitations on the amount of
time allotted for each extension trial with the single deoxynucleoside triphosphates
(dNTPs). When reaction is terminated by flushing away the dNTP supply, some
small fraction of the primer strands may remain unextended. These strands on
subsequent dNTP reaction cycles will continue to extend but will be out of phase with
the majority strands, giving rise to small out-of-phase signals (i.e., signaling a positive
incorporation for an added dNTP which is incorrect for extension of the majority
strands). Because extension failure can occur, statistically, on any extension event,
these out-of-phase signals will increase as the population of strands with extension
failures grows. Ultimately the out-of-phase signal becomes comparable in amplitude
with the signal due to correct extension of the majority strands and the sequence may
be unreadable. The length by which the primer has been extended when the sequence
becomes unreadable is known as the sequencing read length.
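
The growth of the out-of-phase population described above can be illustrated with a
short numerical sketch. It assumes, purely for illustration, that every extension
event fails independently with the same probability and that the sequence becomes
unreadable once the in-phase fraction drops to a chosen threshold; the function names
and the 1% and 50% figures are hypothetical and are not taken from the specification.

```python
import math


def in_phase_fraction(p_fail: float, n_events: int) -> float:
    """Fraction of strands that have suffered no extension failure
    after n_events extension events (independent failures assumed)."""
    return (1.0 - p_fail) ** n_events


def events_until_fraction(p_fail: float, threshold: float) -> int:
    """Smallest number of extension events after which the in-phase
    fraction has fallen to `threshold` or below."""
    return math.ceil(math.log(threshold) / math.log(1.0 - p_fail))


if __name__ == "__main__":
    # Hypothetical numbers: 1% failure per extension event, 50% threshold.
    n = events_until_fraction(0.01, 0.5)
    print(n, in_phase_fraction(0.01, n))
```

Under these assumptions the uncorrected read length scales roughly inversely with the
per-event failure probability, which is the limitation the methods below address.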

The present invention relates to a method that can extend the
sequencing read length in two ways: first, by discriminating between the in-phase and
out-of-phase signals, and second, by calculating where, and how, a dNTP probe
sequence can be altered so as selectively to extend the out-of-phase strands to bring
them back into phase with the majority strands.

Specifically, a method is provided for discriminating between the in-phase
and out-of-phase sequencing signals comprising:

(i) detecting and measuring error signals thereby determining
the size of the trailing strand population;

(ii) between the 3' terminus of the trailing strand primers and
the 3' terminus of the leading strand primers;

(iii) simulating the occurrence of an extension failure at a point
upstream from the 3' terminus of the leading strands thereby
predicting at each extension step the exact point in the
sequence previously traversed by the leading strands to
which the 3' termini of the trailing strands have been
extended;

(iv) predicting for each dNTP introduced the signal to be
expected from correct extension of the trailing strands; and

(v) subtracting the predicted signal from the measured signal to
yield a signal due only to correct extension of the leading
strand population.


"Upstream" refers to the known sequence of bases correctly
incorporated onto the primer strands. "Downstream" refers to the sequence beyond
the 3' terminus. Thus for the leading strand population the downstream sequence is
unknown but is predetermined by the sequence of the template strand that has not yet
been read; for the trailing strand population, the downstream sequence is known for
the gap between the 3' termini of the trailing and leading strands.

The gap between the leading and trailing primer strands may be 1, 2 or
3 bases (where a single base repeat of any length, e.g. AAAA, is counted as a single
base because the entire repeat will be traversed in a single reaction cycle if the correct
dNTP is introduced), but can never exceed 3 bases nor shrink spontaneously to zero if
the reaction cycle of the four dNTPs is unchanged and no other reaction errors occur,
for example a second extension failure on the same primer strand. If the reaction
cycle of the four dNTPs is unchanged, it may readily be understood that a primer
strand which has failed to extend when the correct dNTP, for example dATP, is in the
reaction chamber cannot trail the leading (majority) strands (which did extend) by
more than 3 bases, because the fourth base in the dNTP reaction cycle will always
once again be the correct base (dATP) for the strand which failed to extend
previously. Similarly, a trailing strand resulting from an extension failure can never
re-synchronize with the leading strands if extension subsequently proceeds correctly,
because the leading strands will always have extended by at least one more nucleotide
(G, T, or C in the example discussion of an A extension failure) before the trailing
strand can add the missing A. The effect is that after each complete dNTP cycle the
trailing strands always follow the leading strands by an extension amount that
represents the bases added in one complete dNTP cycle at a given point in the
sequence. A further consequence is that all trailing strands that have undergone a
single failure are in phase with each other regardless of the point at which the
extension failure occurred.
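
The one-cycle lag described in the preceding paragraph can be checked with a small
simulation. The sketch below is illustrative only: the function names, the example
sequence and the fixed A, C, G, T probe order are assumptions, strand positions are
counted in collapsed repeats (so AAAA counts as one position), and the trailing strand
is modelled as skipping exactly one extension and behaving normally thereafter.

```python
from itertools import groupby


def collapse_repeats(seq):
    """Collapse single-base repeats: 'GATTACA' -> ['G', 'A', 'T', 'A', 'C', 'A']."""
    return [base for base, _ in groupby(seq)]


def simulate(seq, cycle="ACGT", fail_step=0, n_steps=24):
    """Return (lead, trail) positions after each dNTP step.

    `seq` is the sequence to be synthesized on the primer. The trailing
    strand misses the first extension offered at or after `fail_step`.
    """
    runs = collapse_repeats(seq)
    lead = trail = 0
    failed = False
    history = []
    for step in range(n_steps):
        dntp = cycle[step % len(cycle)]
        if lead < len(runs) and runs[lead] == dntp:
            lead += 1                      # leading strands always extend
        if trail < len(runs) and runs[trail] == dntp:
            if not failed and step >= fail_step:
                failed = True              # the single extension failure
            else:
                trail += 1
        history.append((lead, trail))
    return history


if __name__ == "__main__":
    example = "GATTACAGGCTTAGCATCGGATACGTTCAGATCCGATGCA"  # hypothetical sequence
    for step, (lead, trail) in enumerate(simulate(example, fail_step=5)):
        print(step, lead, trail, lead - trail)
```

In this model, once one full cycle has elapsed after the failure, the trailing
position at any step equals the leading position four steps earlier, whatever value
of `fail_step` is chosen, which is why all single-failure strands end up in phase
with one another.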

The methods described herein may be utilized to significantly extend
the read length that can be achieved by the technique of reactive sequencing by
providing a high level of immunity to erroneous signals arising from extension
failure. In a preferred embodiment of the invention, the discrimination method of the
invention is computer based.

First, determination of the readout signals allows real-time
discrimination between the signals due to correct extension of the leading strand
population and error signals arising from extension of the population of trailing
strands resulting from extension failure. Using this information, accurate sequence
readout can be obtained significantly beyond the point at which the trailing strand
signals would begin to mask the correct leading strand signals. In fact, because the
trailing strand signals can always be distinguished from the leading strand signals, it is
possible to allow the trailing strand population to continue to grow, at the expense of
the leading strands, to the point where the sequence is read from the signals generated
on the trailing strand population, and the leading strand signals are treated as error
signals to be corrected for. Ultimately, as the probability that a primer strand will
have undergone at least one extension failure approaches unity, the signals from the
leading strand population will disappear. Correspondingly, the probability will
increase that a trailing strand will undergo a second extension failure; the signals from
this second population of double failure strands can be monitored and the single
failure strand signals corrected in just the same way as the zero failure strand signals
were corrected for signals due to single failure strands.

Second, because knowledge of the leading strand sequence permits one
to know the point to which the trailing strands have advanced, by simulating the effect
of an extension failure on that known sequence in a computer, and also to know the
sequence in the 1, 2 or 3 base gap between these strands and the leading strands, then
for a given template sequence the dNTP probe cycle can be altered at any point to
selectively extend the trailing strands while not extending the leading strands, thereby
resynchronizing the populations. Alternatively, the gap between leading and trailing
strands can be simulated in the computer and the gap can be eliminated by reversing
the dNTP cycle whenever the gap shrinks to a single base. These processes are
referred to as "healing." If a large number of different sequences are being read in
parallel with the same dNTP reagents, an altered dNTP probe cycle that is correct for
healing extension failure strands on a given sequence may not be correct for healing
other sequences. However, with a large enough number of parallel sequence
readouts, roughly one-third of the sequences will have trailing strands with a 1-base
gap at any point, and so reversal of the dNTP probe cycle at arbitrary intervals will
heal roughly one-third of the readouts with extension failure gaps. Repeated arbitrary
reversal of the dNTP probe cycle eventually heals roughly two-thirds of all the
readouts. The overall effect of these error correction and error elimination processes
is to reduce, or eliminate, any limitation on read length arising from extension failure.
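
The cycle-reversal form of healing can be sketched in the same simplified model. The
code below is an illustration under stated assumptions rather than the method of the
invention: "reversal" is interpreted as replaying the probe cycle backwards starting
with the dNTP that has just been used, the `earliest_reversal` parameter is a
hypothetical stand-in for the delay before the computer simulation acts on the gap,
and the function name and example sequence are invented for the example.

```python
from itertools import groupby


def heal_by_reversal(seq, cycle="ACGT", fail_step=0, earliest_reversal=0, n_steps=60):
    """Simulate one extension failure, then reverse the dNTP cycle the first
    time the simulated gap is a single base at or after `earliest_reversal`.

    Returns a log of (dNTP, lead, trail) tuples; positions count collapsed
    repeats. The log stops as soon as the two populations are back in phase.
    """
    runs = [base for base, _ in groupby(seq)]
    lead = trail = 0
    failed = False
    idx, direction = 0, 1          # position in the probe cycle, +1 or -1
    log = []
    for step in range(n_steps):
        dntp = cycle[idx % len(cycle)]
        if lead < len(runs) and runs[lead] == dntp:
            lead += 1
        if trail < len(runs) and runs[trail] == dntp:
            if not failed and step >= fail_step:
                failed = True      # the single extension failure
            else:
                trail += 1
        log.append((dntp, lead, trail))
        if failed and lead == trail:
            break                  # gap healed
        if failed and direction == 1 and lead - trail == 1 and step >= earliest_reversal:
            direction = -1         # reverse; the next probe repeats this dNTP
        else:
            idx += direction
    return log


if __name__ == "__main__":
    example = "GATTACAGGCTTAGCATCGGATACGTTCAGATCCGATGCA"  # hypothetical sequence
    for entry in heal_by_reversal(example, fail_step=0, earliest_reversal=5):
        print(entry)
```

With these inputs the probe order becomes A, C, G, T, A, C, G, G, C, A: the trailing
strands receive their missing base before the leading strands receive their next
one, so both populations finish at the same position.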

The ability to overcome the read length limitations imposed by
extension failure provides significant additional flexibility in experimental design. For
example, it may be that read length is not limited by extension failure, but rather by
misincorporation of incorrect nucleotides, which shuts down extension on the affected
strands and steadily reduces the signal, ultimately to the point where it is not
detectable with the desired accuracy. In this case, the ability to eliminate the effects
of extension failure allows the experimenter great flexibility to alter the reaction
conditions in such a way that misincorporation is minimized, at the expense of an
increased incidence of extension failure. Misincorporation frequency depends in part
on the concentration of the probing dNTPs and the reaction time allowed. Longer
reaction times, or higher dNTP concentrations, result in an increased probability of
misincorporation, but a reduced incidence of extension failure. Therefore, if a higher
level of extension failure can be tolerated due to, for example, the computer-aided
signal discrimination and dNTP cycle-reversal healing methods, then reaction times
and/or dNTP reagent concentrations can be reduced to minimize misincorporation,
with the resulting increase in extension failure being countered by the computer-aided
signal discrimination and/or dNTP cycle-reversal healing techniques described
above.






If the deoxyribonucleotides used for the polymerase reaction are
impure, a small fraction of strands will extend when the main nucleotide is incorrect
and produce a population of leading, rather than trailing, error strands. As with the
trailing strands, the leading strand population is never more than three bases, nor less
than one base, ahead of the main population, unless a second error occurs on the same
strand, and also, regardless of where an incorrect extension by an impurity dNTP
occurs, the leading strands are all in phase with each other. A given base site can be
probed either 1, 2 or 3 times with an incorrect dNTP before it must be extended by the
correct dNTP, so on the average twice. If each of the incorrect dNTPs is assumed to
carry the same percentage of dNTP impurity, then the probability of incorrect
extension by, e.g., 99% pure dNTP containing the correct complementary base as an
impurity is 1% ÷ 3 (only 1/3 of the impurity will be the correct complementary base)
× 2 (average 2 incorrect trials between each correct extension), that is, 0.67%.

As with trailing strands, the leading strand population can produce
out-of-phase extension signals that complicate the readout of the majority strand
sequence, as shown in Figure 15. Because the sequence downstream of the 3'
terminus of the majority strands is not known at the time of extension of those strands,
the signal due to leading strand extension cannot immediately be corrected for, nor
can an altered dNTP cycle be calculated which would automatically heal the gap
between majority and leading strands for a given template sequence. However, similar
methods can be used to ameliorate the effects of a leading strand population. First, as
with trailing strands, reversal of the dNTP probe cycle automatically heals the gap
between leading and majority strand populations whenever the gap shrinks to a single
base. Therefore, arbitrary reversal of the dNTP probe cycle has a 1/3 probability of
healing the gap for a given sequence, or will heal 1/3 of the sequences in a large
population of sequences probed in parallel. Continued arbitrary reversal eventually
heals roughly two-thirds of such gaps. Second, although the sequence downstream of
the 3' terminus of the majority strands is not immediately known, information about
this sequence becomes available as soon as the majority strands traverse the gap
region. Therefore, for each extension of the majority strands it is possible, ideally
using a computer simulation, to calculate when the leading strand population would
have traversed that base and thus the signal by which a prior extension of the majority
strands would have been contaminated. In this way the majority strand extension
signals can retrospectively be corrected for leading strand signals.

There are important aspects to leading strand creation that reduce the
frequency of occurrence of leading strand events. First, if the concentration of
impurity dNTPs is sufficiently low, a leading strand population cannot be created by
impurity extension of the first base of a repeat. This is because the probability of
incorrect incorporation of two impurity bases on the same strand in the same reaction
cycle is the square of the probability for a single incorporation, and therefore
vanishingly small for small impurity levels. Therefore, whenever the correct dNTP for
extension of the repeat length is supplied, all strands will be extended to completion,
regardless of whether some fraction of the
strands were already partially extended by one base of the repeat. Second, not all
incorrect extensions result in a permanent phase difference. For a permanent phase
difference to result, a second extension (by a correct base) must occur on the leading
strand before the main strands extend to catch up to the leading strand. Labeling the
next four sites along the template sequence 1, 2, 3, 4, then, by definition, if a leading
strand is created by incorporation of an impurity base on site 1 while the majority of
the strands do not extend, the main nucleotide supplied is incorrect for extension at
site 1. If the main nucleotide supplied is correct for extension at site 2, a 2-base lead is
created. There is 1 chance in 4 that the reaction chamber contains the correct
nucleotide for site 2, so the probability of creating a 2-base extension in a single step
(with an impurity extension followed by a correct extension) is 1/4 the probability of
the impurity extension alone. For the 0.67% impurity extension probability cited
above, this means a 0.16% probability of creating a 2-base extension in a single cycle.

However, if the main nucleotide supplied is incorrect for further
extension at site 2, and, by definition, incorrect for extension at site 1, then for the lead
to become fixed, the correct nucleotide for site 2 must be supplied before the correct
nucleotide to extend at site 1. The probability that site 2 will extend before site 1 is
therefore 50%; for a 0.67% impurity extension probability, the probability that this
creates a fixed lead due to a second extension by a correct nucleotide is 0.33%.
Overall, a 1% impurity level results in approximately 0.5% probability of creating a
leading strand in any given reaction trial.
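
The arithmetic of the preceding paragraphs can be retraced in a few lines. This sketch
simply follows the figures as stated in the text for a 1% impurity level; the variable
names are illustrative, and the combination of the two cases by simple addition mirrors
the text's own estimate rather than a more detailed probability model.

```python
impurity_level = 0.01  # 99% pure dNTP, as in the text's example

# One third of the impurity is the correct complementary base, and a site is
# probed with an incorrect dNTP on average twice before the correct one arrives.
p_impurity_extension = impurity_level / 3 * 2          # about 0.67%

# Case 1: the main nucleotide happens to be correct for site 2 (1 chance in 4),
# creating a 2-base lead in a single step. The text quotes this as 0.16%.
p_two_base_single_step = p_impurity_extension / 4      # about 0.17%

# Case 2: the lead becomes fixed later if site 2 is extended before site 1 (50%).
p_fixed_later = p_impurity_extension * 0.5             # about 0.33%

p_leading_strand = p_two_base_single_step + p_fixed_later  # about 0.5%

print(f"{p_impurity_extension:.4%} {p_two_base_single_step:.4%} "
      f"{p_fixed_later:.4%} {p_leading_strand:.4%}")
```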

Preparation of specific embodiments in accordance with the present
invention will now be described in further detail. These examples are intended to be
illustrative and the invention is not limited to the specific materials and methods set
forth in these embodiments.

Example 1

A microcalorimetric experiment was performed which demonstrates for
the first time the successful thermal detection of a DNA polymerase reaction. The
results are shown in Figure 3. Approximately 20 units of T7 Sequenase were injected
into a 3 mL reaction volume containing approximately 20 nmol of DNA template and
complementary primer, and an excess of dNTPs. The primer was extended by
52 base pairs, the expected length given the size of the template. Using a commercial
microcalorimeter (TAM Model 2273; Thermometrics, Sweden) a reaction enthalpy of
3.5-4 kcal per mole of base was measured (Figure 3). This measurement is well
within the value required for thermal detection of DNA polymerase activity. This
measurement also demonstrates the sensitivity of thermopile detection, as the
maximum temperature rise in the reaction cell was 1×10⁻³ °C. The lower trace seen in
Figure 3 is from a reference cell showing the injection artifact for an enzyme-free
injection into buffer containing no template system.
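
As a rough plausibility check on the figures in this example, the heat released and an
upper bound on the temperature rise can be estimated as follows. The sketch assumes
that every primer is fully extended, that the reaction volume has the density and heat
capacity of water, and that no heat is lost to the surroundings; these assumptions and
the function name are illustrative and are not part of the reported experiment.

```python
CAL_TO_JOULE = 4.184                  # thermochemical calorie
WATER_HEAT_CAPACITY_J_PER_G_K = 4.184
WATER_DENSITY_G_PER_ML = 1.0


def adiabatic_temperature_rise(template_mol, bases_added, enthalpy_kcal_per_mol, volume_ml):
    """Upper-bound temperature rise (kelvin) if all reaction heat stays in the cell."""
    heat_joule = template_mol * bases_added * enthalpy_kcal_per_mol * 1000.0 * CAL_TO_JOULE
    heat_capacity = volume_ml * WATER_DENSITY_G_PER_ML * WATER_HEAT_CAPACITY_J_PER_G_K
    return heat_joule / heat_capacity


if __name__ == "__main__":
    # Values from the example: 20 nmol template, 52 bases, 3.5-4 kcal/mol, 3 mL.
    for enthalpy in (3.5, 4.0):
        print(enthalpy, adiabatic_temperature_rise(20e-9, 52, enthalpy, 3.0))
```

Under these assumptions the estimate is on the order of 10⁻³ K, the same order of
magnitude as the maximum temperature rise reported above.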

Example 2

To illustrate the utility of mutant T4 polymerases, two primer
extension assays were performed with two different mutant T4 polymerases, both of
which are exonuclease deficient. In one mutant, Asp112 is replaced with Ala and
Glu114 is replaced with Ala (D112A/E114A). The exonuclease activity of this
mutant on double-stranded DNA is reduced by a factor of about 300 relative to the
wild type enzyme as described by Reha-Krantz and Nonay (1993, J. Biol. Chem.
268:27100-27108). In a second polymerase mutant, in addition to the D112A/E114A
amino acid substitutions, a third substitution replaces Ile417 with Val
(I417V/D112A/E114A). The I417V mutation increases the accuracy of synthesis by
this polymerase (Stocki, S. A. and Reha-Krantz, L. J., 1995, J. Mol. Biol.
245:15-28; Reha-Krantz, L. J. and Nonay, R.L., 1994, J. Biol. Chem. 269:5635-5643).

Two separate primer extension reactions were carried out using each of
the polymerase mutants. In the first, only a single correct nucleotide, dGTP,
corresponding to a template C was added. The next unpaired template site is a G so
that misincorporation would result in formation of a G•G mispair. A G•G mispair
tends to be among the most difficult mispairs for polymerases to make. In the second
primer extension reaction, two nucleotides, dGTP and dCTP, complementary to the
first three unpaired template sites were added. Following correct incorporation of
dGMP and dCMP, the next available template site is a T. Formation of C•T mispairs
tends to be very difficult while G•T mispairs tend to be the most frequent mispairs
made by polymerases.

Time courses for primer extension reactions by both mutant T4
polymerases are shown in Figure 4. Low concentrations of T4 polymerase relative to
primer/template (p/t) were used so that incorporation reactions could be measured on
convenient time scales (60 min). By 64 minutes 98% of the primers were extended.
In reactions containing only dGTP, both polymerases nearly completely extended
primer ends by dGMP without any detectable incorporation of dGMP opposite G. In
reactions containing both dGTP and dCTP, both polymerases nearly completely
extended primer ends by addition of one dGMP and two dCMPs. A small percentage
(≈1%) of misincorporation was detectable in the reaction catalyzed by the
D112A/E114A mutant. Significantly, no detectable misincorporation was seen in the
reaction catalyzed by the I417V/D112A/E114A mutant.



Example 3 

In accordance with the invention a fluorescent tag may be attached to
the nucleotide base at a site other than the 3' position of the sugar moiety.
Chemistries for such tags which do not interfere with the activity of the DNA
polymerase have been developed as described by Goodwin et al. (1995, Experimental
Technique of Physics 41:279-294). Generally the tag is attached to the base by a
linker arm of sufficient length to move the bulky tag out of the active site of the
enzyme during incorporation.

As illustrated in Figure 5, a nucleotide can be connected to a
fluorophore by a photocleavable linker, e.g., a benzoin ester. After the tagged dNMP
is incorporated onto the 3' end of the DNA primer strand, the DNA template system is
illuminated by light at a wavelength corresponding to the absorption maximum of the
fluorophore and the presence of the fluorophore is signaled by detection of
fluorescence at the emission maximum of the fluorophore. Following detection of the
fluorophore, the linker may be photocleaved to produce compound 2; the result is an
elongated DNA molecule with a modified but non-fluorescent nucleotide attached.
Many fluorophores, including for example a dansyl group or acridine, may be
employed in the methodology illustrated by Figure 5.

Alternatively, the DNA template system is not illuminated to stimulate
fluorescence. Instead, the photocleavage reaction is carried out to produce compound
2, releasing the fluorophore, which is removed from the template system into a
separate detection chamber. There the presence of the fluorophore is detected as
before, by illumination at the absorption maximum of the fluorophore and detection
of emission near the emission maximum of the fluorophore.

Example 4

In a specific embodiment of the invention, a linked system consisting
of a chemiluminescently tagged dNTP can consist of a chemiluminescent group (the
dioxetane portion of compound 4), a chemically cleavable linker (the silyl ether), and
an optional photocleavable group (the benzoin ester) as depicted in Figure 6. The
cleavage of the silyl ether by a fluoride ion produces detectable chemiluminescence as
described in Schaap et al. (1991, "Chemical and Enzymatic Triggering of
1,2-dioxetanes: Structural Effects on Chemiluminescence Efficiency" in
Bioluminescence & Chemiluminescence, Stanley, P.E. and Kricka, L.J. (Eds), Wiley,
N.Y. 1991, pp. 103-106). In addition, the benzoin ester that links the nucleoside
triphosphate to the silyl linker is photocleavable as set forth in Rock and Chan (1996,
J. Org. Chem. 61:1526-1529); and Felder, et al. (1997, First International Electronic
Conference on Synthetic Organic Chemistry, Sept 1-30). Having both a
chemiluminescent tag and a photocleavable linker is not always necessary; the silyl
ether can be attached directly to the nucleotide base and the chemiluminescent tag is
destroyed as it is read.

As illustrated in Figure 6 with respect to compound 3, treatment with
fluoride ion liberates the phenolate ion of the adamantyl dioxetane, which is known to
chemiluminesce with high efficiency (Bronstein et al., 1991, "Novel
Chemiluminescent Adamantyl 1,2-dioxetane Enzyme Substrates," in
Bioluminescence & Chemiluminescence, Stanley, P.E. and Kricka, L.J. (Eds), Wiley,
N.Y. 1991, pp. 73-82). The other product of the reaction is compound 4, which is no
longer chemiluminescent. Compound 4 upon photolysis at 308-366 nm liberates
compound 2.

The synthesis of compound 1 is achieved by attachment of the
fluorophore to the carboxyl group of the benzoin, whose α-keto hydroxyl group is
protected by 9-fluorenylmethoxycarbonyl (FMOC), followed by removal of the
FMOC protecting group and coupling to the nucleotide bearing an activated carbonic
acid derivative at its 3' end. Compound 4 is prepared via coupling of the vinyl ether
form of the adamantyl phenol to chloro(3-cyanopropyl)dimethylsilane, reduction of
the cyano group to the amine, generation of the dioxetane, and coupling of this
chemiluminescence precursor to the nucleotide bearing an activated carbonic acid
derivative at its 3' end.

The chemiluminescent tag can also be attached to the dNTP by a
cleavable linkage and cleaved prior to detection of chemiluminescence. As shown in
Figure 7, the benzoin ester linkage in compound 3 may be cleaved photolytically to
produce the free chemiluminescent compound 5. Reaction of compound 5 with
fluoride ion to generate chemiluminescence may then be carried out after compound 5
has been flushed away from the DNA template primer in the reaction chamber. As an
alternative to photolytic cleavage, the tag may be attached by a chemically cleavable
linker which is cleaved by chemical processing which does not trigger the
chemiluminescent reaction.

Example 5

In this example, the nucleotide sequence of a template molecule
comprising a portion of DNA of unknown sequence is determined. The DNA of
unknown sequence is cloned into a single stranded vector such as M13. A primer that
is complementary to a single stranded region of the vector immediately upstream of
the foreign DNA is annealed to the vector and used to prime synthesis in reactive
sequencing. For the annealing reaction, equal molar ratios of primer and template
(calculated based on the approximation that one base contributes 330 g/mol to the
molecular weight of a DNA polymer) are mixed in a buffer consisting of 67 mM
Tris-HCl pH 8.8, 16.7 mM (NH4)2SO4 and 0.5 mM EDTA. This buffer is suitable
both for annealing DNA and subsequent polymerase extension reactions. Annealing
is accomplished by heating the DNA sample in buffer to 80°C and allowing it to
slowly cool to room temperature. Samples are briefly spun in a microcentrifuge to
remove condensation from the lid and walls of the tube. To the DNA is added 0.2
mol equivalents of T4 polymerase mutant I417V/D112A/E114A and buffer
components so that the final reaction cell contains 67 mM Tris-HCl pH 8.8, 16.7 mM
(NH4)2SO4, 6.7 mM MgCl2 and 0.5 mM dithiothreitol. The polymerase is then
queried with one dNTP at a time at a final concentration of 10 µM. The nucleotide is
incubated with polymerase at 37°C for 10 s. Incorporation of dNTPs may be detected
by one of the methods described above including measuring fluorescence,
chemiluminescence or temperature change. The reaction cycle will be repeated with
each of the four dNTPs until the complete sequence of the DNA molecule has been
determined.
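
The equimolar primer and template calculation mentioned in this example can be
sketched as follows, using the stated approximation of 330 g/mol per base. The
function names and the lengths and mass in the usage lines are hypothetical values
chosen only to show the calculation.

```python
GRAMS_PER_MOL_PER_BASE = 330.0  # approximation stated in the text


def approx_molecular_weight(n_bases: int) -> float:
    """Approximate molecular weight (g/mol) of a single-stranded DNA polymer."""
    return n_bases * GRAMS_PER_MOL_PER_BASE


def primer_mass_for_equimolar(template_mass_ug: float, template_bases: int,
                              primer_bases: int) -> float:
    """Mass of primer (µg) containing as many moles as the given template mass."""
    template_umol = template_mass_ug / approx_molecular_weight(template_bases)
    return template_umol * approx_molecular_weight(primer_bases)


if __name__ == "__main__":
    # Hypothetical inputs: 10 µg of a 7000-base template and a 20-base primer.
    print(primer_mass_for_equimolar(10.0, 7000, 20))
```

Because the same per-base weight is applied to both molecules, the required primer
mass reduces to the template mass scaled by the ratio of the two lengths.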

Example 6

Figure 7 illustrates a mechanical fluorescent sequencing method in
accordance with the invention. A DNA template and primer are captured onto beads
18 using, for example, avidin-biotin or -NH2/N-hydroxysuccinimide chemistry and
loaded behind a porous frit or filter 20 at the tip of a micropipette 22 or other
aspiration device as shown in Figure 7(a), step 1. Exonuclease deficient polymerase
enzyme is added and the pipette tip is lowered into a small reservoir 24 containing a
solution of fluorescently-labeled dNTP. As illustrated in step 2 of Figure 7(a), a small
quantity of dNTP solution is aspirated through the filter and allowed to react with the
immobilized DNA. The dNTP solution also contains approximately 100 nM
polymerase enzyme, sufficient to replenish rinsing losses. After reaction, as shown in
step 3, the excess dNTP solution 24 is forced back out through the frit 20 into the
dNTP reservoir 24. In step 4 of the process the pipette is moved to a reservoir
containing buffer solution and several aliquots of buffer solution are aspirated through
the frit to rinse excess unbound dNTP from the beads. The buffer inside the pipette is
then forced out and discarded to waste 26. The pipette is moved to a second buffer
reservoir (buffer 2), containing the chemicals required to cleave the fluorescent tag
from the incorporated dNMP. The reaction is allowed to occur to cleave the tag. As
shown in step 5, the bead/buffer slurry with the detached fluorescent tag in solution is
irradiated by a laser or light source 28 at a wavelength chosen to excite the fluorescent
tag; the fluorescence is detected by fluorescence detector 30 and quantified if
incorporation has occurred.

Subsequent steps depend on the enzyme strategy used. If a single-stage
strategy with an exonuclease-deficient polymerase is used, as illustrated in Figure
7(b), the solution containing the detached fluorescent tag is expelled and discarded to
waste (step 6), followed by a further rinse step with buffer 1 (step 7) which is
thereafter discarded (step 8), and the pipette is moved to a second reservoir containing
a different dNTP (step 9) and the process repeats starting from step 3, cycling through
all four dNTPs.

In a two-stage strategy, after the correct dNTP has been identified and
the repeat length quantified in step 5, the reaction mixture is rinsed as shown in steps
6, 7, and 8 of Figure 7(b) and the pipette is returned to a different reservoir containing
the same dNTP (e.g., dNTP1) as shown in step (a) of Figure 8, to which a quantity of
exonuclease-proficient polymerase has been added, and the solution is aspirated for a
further stage of reaction which proof-reads the prior extension and correctly
completes the extension. This second batch of dNTP need not be fluorescently
tagged, as the identity of the dNTP is known and no sequence information will be
gained in this proof-reading step. If a tagged dNTP is used, the fluorescent tag is
preferably cleaved and discarded as in step 5 of Figure 7(a) using buffer 2.
Alternatively, the initial incorporation reaction shown in step 2 of Figure 7(a) is
carried out for long enough, and the initial polymerase is accurate enough, so that the
additional amount of fluorescent tag incorporated with dNTP1 at step (a) of Figure 8 is
small and does not interfere with quantification of the subsequent dNTP. Following
proof-reading in step (a) of Figure 8, excess dNTP is expelled (step b) and the reaction
mixture is rinsed (steps c, d) with a high-salt buffer to dissociate the exo+ polymerase
from the DNA primer/template. It is important not to have exonuclease-proficient
enzyme present if the DNA primer/template is exposed to an incorrect dNTP. The
pipette is then moved to step e, in which the reservoir contains a different dNTP, and
the process is repeated, again cycling through all four dNTPs.

Example 7

A new process for destruction of a fluorophore signal which involves
reaction of the electronically excited fluorophore with an electron-abstracting species,
such as diphenyliodonium salts, is described.

The reaction of a diphenyliodonium ion with an electronically excited
fluorescein molecule is illustrated in Figure 10. The diphenyliodonium ion extracts
an electron from the excited state of the fluorescein molecule, producing a radical ion
of the fluorescein molecule and a neutral diphenyliodonium free radical. The
diphenyliodonium free radical rapidly decomposes to iodobenzene and a phenyl
radical. The fluorescein radical ion then either reacts with the phenyl radical or
undergoes an internal rearrangement to produce a final product which is no longer
fluorescent.

Figures 11 and 12 demonstrate evidence for the specific destruction of
fluorescein by diphenyliodonium ion. In Figure 11, fluorescence spectra are presented
for a mixture of fluorescein and tetramethylrhodamine dyes, before and after addition
of a solution of diphenyliodonium chloride. It is seen that the fluorescence from the
fluorescein dye is immediately quenched, demonstrating electron abstraction from the
excited state of the molecule while the fluorescence from the rhodamine is unaffected, 
apart from a small decrease due to the dilution of the dye solution by the added
diphenyliodonium chloride solution.

Elimination of the fluorescent signal from the fluorescein dye by 
diphenyliodonium chloride is not in itself proof that the fluorescein molecule has been 
destroyed, because electron abstraction from the excited state of fluorescein 
effectively quenches the fluorescence, and quenching need not result in destruction of
the fluorescein molecule. However, Figure 12 demonstrates that the fluorescein
molecule is destroyed by reaction with the diphenyliodonium and not simply 
quenched. Figure 12 shows the ultraviolet (UV) absorption spectra for a
fluorescein solution before and after addition of a solution of diphenyliodonium
chloride. Spectrum 1 is the UV absorption spectrum of a pure fluorescein solution.
Spectrum 2 is the UV absorption of the fluorescein solution following the addition of
a solution containing a molar excess of diphenyliodonium (DPI) chloride and
exposure to a single flash from a xenon camera strobe. The data show that fluorescein
is essentially destroyed by the photochemical reaction with the DPI ion. Figure 12
provides clear evidence that diphenyliodonium chloride not only quenches the
fluorescence from the fluorescein dye but destroys the molecule to such an extent that 
it can no longer act as a fluorophore. 

An experiment was performed to demonstrate efficient fluorescent
detection and destruction of fluorophore using a template sequence. The template,
synthesized with an alkylamino linker at the 5' terminus, was:

3'-H2N-(CH2)7-GAC CAT TAT AGG TCT TGT TAG GGA AAG GAA GA-5'

The trial sequence to be determined is: G GGA AAG GAA GA.

A tetramethylrhodamine-labeled primer sequence was synthesized to be
complementary to the template as follows:

5'-[Rhodamine]-(CH2)6-CTG GTA ATA TCC AGA ACA AT-3'

The alkylamino-terminated template molecules were chemically linked
to Sepharose beads derivatized with N-hydroxysuccinimide and the rhodamine-labeled
primer was annealed to the template. The beads with attached DNA template
and annealed primer were loaded behind a B-100 disposable filter in a 5-mL syringe.
A volume containing a mixture of fluorescein-labeled and unlabeled dCTP in a ratio
of 1:2 and exonuclease-deficient polymerase enzyme in a reaction buffer as specified
by the manufacturer was drawn into the syringe. Reaction was allowed to proceed for 
20 minutes, at 35°C. After the reaction, the fluid was forced out of the syringe,
retaining the beads with the reacted DNA behind the filter, and three washes with
double-distilled water were performed by drawing water through the filter into the 
syringe and expelling it. The beads were resuspended in phosphate buffer, the filter 
was removed and the suspension was dispensed into a cuvette for fluorescence 
analysis. Following fluorescence analysis, the bead suspension was loaded back into
the syringe which was then fitted with a filter tip, and the phosphate buffer was
dispensed. A solution of DPI was drawn up into the syringe with a concentration
calculated to be in 1:1 molar equivalence to the theoretical amount of DNA template,
the filter was removed and the bead suspension was dispensed into a cuvette for UV 
light exposure for 15 minutes. The suspension was recollected into a syringe, the
filter was reattached, the DPI solution was expelled, and the beads were resuspended
by drawing up 0.7 mL of phosphate buffer. After removal of the filter the bead 
suspension was dispensed into a clean cuvette for fluorescence analysis to check the 
completeness of destruction of the fluorescein by the reaction with the DPI. A
subsequent polymerase reaction was performed using the same protocol with labeled
dTTP and similarly measured for fluorescence. 

Figure 13 demonstrates the results of the polymerase reactions, with
photochemical destruction of the fluorescein label by DPI following each nucleotide
incorporation reaction. Curve 1 shows rhodamine fluorescence following annealing
of the rhodamine-labeled primer to the beads, demonstrating covalent attachment of
the template strands to the beads and capture of the rhodamine-labeled primer strands.
Curve 2 demonstrates detection of fluorescein following polymerase-catalyzed
incorporation of three partially fluorescein-labeled dCMPs onto the 3' terminus of the
primer strands. Curve 3 shows complete destruction of the incorporated fluorescein
label by photo-induced reaction with diphenyliodonium chloride. Loss of rhodamine
signal here is attributed to loss of a significant fraction of the beads which stuck to the
filter during washes. Curve 4 shows detection of a new fluorescein label following
photochemical destruction of the fluorescein attached to the dCMPs and subsequent
polymerase-catalyzed incorporation of three partially fluorescein-labeled dTMPs onto
the 3' terminus of the primer strands.

The following methods were utilized to demonstrate successful 
destruction of a fluorescein-labeled dTMP. 

Sepharose beads were purchased from Amersham with surfaces
derivatized with N-hydroxysuccinimide for reaction with primary amine groups. The
alkylamino-terminated templates were chemically linked to the Sepharose beads using 
the standard procedure recommended by the manufacturer. 

The beads with attached template were suspended in 250 mM Tris
buffer containing 250 mM NaCl and 40 mM MgCl2. The solution containing the
primer strands was added and the mixture heated to 80°C and cooled over ~2 hours to
anneal the primers to the surface-immobilized DNA template strands.

Fluorescein-labeled dUTP and dCTP were purchased from NEN Life 
Science Products. Unlabeled dTTP and dCTP were purchased from Amersham. 

Prior to any reaction, the annealed primer/template was subjected to
fluorescence analysis to ensure that annealing had occurred. The excitation
wavelength used was 320 nm and fluorescence from fluorescein and rhodamine was
detected at ~520 nm and ~580 nm respectively.

Reagent volumes were calculated on the assumption that the DNA
template was attached to the beads with 100% efficiency. 

The 5X reaction buffer contained:

1) 250 mM Tris buffer, pH 7.5
2) 250 mM NaCl
3) 40 mM MgCl2
4) 1 mg/mL BSA
5) 25 mM dithiothreitol (DTT)

mixed and brought to volume with double-distilled H2O.



T4 DNA polymerase was obtained from Worthington Biochemical 
Corp. The polymerase was dissolved in the polymerase buffer according to the 
manufacturer's protocols. 

Fluorescein-labeled and unlabeled dCTPs were mixed in a ratio of 1:2.

The reaction was run in a 5 mL syringe (Becton Dickinson) fitted with
a B-100 disposable filter (Upchurch Scientific). This limits the reaction volume to 5
mL total:

Primer template suspension    0.7 mL
T4 DNA Polymerase             1.0 mL
FdCTP/dCTP                    0.040 mL
5X reaction buffer            2.0 mL
double-dist. H2O              1.0 mL



The reaction was allowed to proceed in a 35°C oven for 20 minutes.
Following reaction, the fluid was forced out of the syringe allowing the filter to retain
the beads with the reacted DNA. Three washes with double-distilled water were
performed. All waste was collected and saved for future reuse. The beads were 
resuspended in 0.7 mL of phosphate buffer, the filter was removed and the suspension 
was dispensed into a cuvette for fluorescence analysis. 

Following fluorescence analysis the bead suspension was collected into
a 1 mL syringe (Becton Dickinson) which was then fitted with a filter tip. The
phosphate buffer was dispensed and the waste collected. A solution of
diphenyliodonium chloride (DPI) was drawn up with a concentration calculated to be
in 1:1 molar equivalence to the theoretical amount of DNA template (i.e. DPI was
present in excess of the incorporated fluorescein-labeled dCTP). The filter was
removed and the bead suspension with added DPI was dispensed into a cuvette and
exposed to UV light for 15 minutes. The suspension was recollected into a syringe,
the filter reattached, the DPI solution was dispensed and the beads were resuspended
in 0.7 mL of phosphate buffer. The bead suspension was dispensed into a clean 
cuvette for fluorescence analysis. 

It should be noted that a significant fraction of the beads used in this 
procedure appeared to become stuck in the filter on the syringe. This resulted in a
significant increase in the pressure needed to force fluids through the filter as it
became clogged by the beads, and more importantly reduced the amount of DNA 
available for fluorescent detection of incorporated nucleotides and reduced the weak 
rhodamine signal from the labeled primer to the point where it was no longer 
detectable. 






Following the successful incorporation reaction with dCTP, a 
subsequent polymerase reaction was run to incorporate dTTP. The incorporated 
fluorescein-labeled dTMP was detected, but with significantly lower intensity due to 
the losses of the beads in the filter in the multiple transfer steps between the reaction 
syringe and the analysis cuvette. The lowered signal could also result in part from a 
different labeling efficiency of the dTTP and a different incorporation efficiency for 
the labeled nucleotide in the polymerase reaction. Because the rhodamine signal was 
no longer detectable following the second incorporation reaction it was not possible to 
correct for bead losses. 

The results are shown in Figure 13. The data represented by the curves
were obtained sequentially as follows:

Curve 1 shows the rhodamine fluorescence following annealing of the
rhodamine-labeled primer to the bead-immobilized DNA template.

Curve 2 demonstrates detection of the fluorescein-labeled dCTP
following polymerase-catalyzed incorporation of three dCMPs onto the 3' terminus of
the primer strands.

Curve 3 demonstrates complete destruction of the incorporated
fluorescein label on the dCMPs by photo-induced reaction with diphenyliodonium
chloride. In this instance, the rhodamine label also has vanished; this is primarily
because a significant fraction of the beads were lost by sticking in the filter used in the
reagent flushing operation. It is possible that the rhodamine also was destroyed by the
DPI photochemical reaction.

Curve 4 demonstrates detection of a new fluorescein label following
photochemical destruction of the fluorescein label on the dCMPs and
polymerase-catalyzed incorporation of three fluorescein-tagged dTMPs onto the 3' terminus of the
primer strands. The lower signal compared to curve 2 results mainly from the bead
losses in the syringe, but may also reflect a lower incorporation efficiency of the
dTMP and/or a lower labeling efficiency. Because the rhodamine signal from the
labeled primer is no longer detectable, the bead losses cannot be calibrated.

The results shown here demonstrate the concept of reactive sequencing 
by fluorescent detection of DNA extension followed by photochemical destruction of 
the fluorophore, which allows further extension and detection of a subsequent added 
fluorophore. This cycle can be repeated a large number of times if sample losses are 
avoided. In practical applications of this approach, such losses will be avoided by 
attaching the primer or template strands to the fixed surface of an array device, for 
example a microscope slide, and transferring the entire array device between a 
reaction vessel and the fluorescent reader. 

Example 8

Read length is defined as the maximum length of DNA sequence that
can be read before uncertainties in the identities of the DNA bases exceed some
defined level. In the reactive sequencing approach, read length is limited by two types
of polymerase failures: misincorporation, i.e., incorrectly incorporating a non- 
complementary base, and extension failure, i.e., failure to extend some fraction of the 
DNA primer strands on a given cycle in the presence of the correct complementary 
base. Example 2 demonstrated that reaction conditions can be optimized such that 
neither type of failure affects more than ~ 1% of the arrayed strands for any given 
incorporation reaction. Neither type of failure directly produces an error signal in the 
sequence readout, because neither a 1% positive signal, for a misincorporation, nor a 
1% decrease in the signal for a correct incorporation, in the case of extension failure, 
will be significant compared to the signals anticipated for a correct incorporation. 
However, accumulated failures limit the read length in a variety of different ways. 



For example, misincorporation inhibits any further extension on the
affected strand, resulting in a reduction in subsequent signals. It is estimated that the
probability of continuing to extend a given strand following a misincorporation is no
greater than 0.1%, so that any contribution to the fluorescent signal resulting from
misincorporation followed by subsequent extension of the error strand will be
negligible. Instead, the accumulation of misincorporations resulting in inhibition of
strand extension ultimately reduces the overall signal amplitude for correct base
incorporation to a level at which noise signals in the detection system begin to have a
significant probability of producing a false signal that is read as a true base
incorporation.

Extension failures typically arise due to the kinetics of the extension 
reaction and limitations on the amount of time allotted for each extension trial with 
the single deoxynucleotide triphosphates (dNTP's). When reaction is terminated by 
flushing away the dNTP supply, a small fraction of the primer strands may remain
unextended. These strands on subsequent dNTP reaction cycles will continue to
extend but will be out of phase with the majority strands, giving rise to small
out-of-phase signals, i.e., signaling a positive incorporation for an added dNTP which
is incorrect for extension of the majority strands. Because extension failure can occur,
statistically, on any extension event, the out-of-phase signals will increase as the
population of strands with extension failures grows. If reaction conditions are chosen
so that the reaction is 99.9% complete on a given reaction cycle, for example, then after a
further number, N, of successful extension reactions, the out-of-phase signal will be
approximately (1 - 0.999^N). The number N at which the out-of-phase signal becomes
large enough to be incorrectly read as a correct extension signal is the read length. For
example, after extension by 200 bases with 99.9% completion, the out-of-phase signal
is approximately 18% of the in-phase signal, for a single base extension in either case.
After extension by 400 bases the out-of-phase signal grows to 33%. The point at
which the read must terminate is dictated by the ability to distinguish the in-phase
signals from the out-of-phase signals.
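
The growth of the out-of-phase population quoted above can be checked numerically. The following Python sketch assumes, as the text does, that every extension event completes with the same probability and that events are independent; the function name is illustrative only and is not part of the described method.

```python
def out_of_phase_fraction(completion: float, n_extensions: int) -> float:
    """Fraction of strands with at least one extension failure after
    n_extensions extension events, each completing with probability
    `completion` (assumed constant and independent between events)."""
    return 1.0 - completion ** n_extensions


# Evaluate the two cases discussed in the text (99.9% completion per cycle).
for n in (200, 400):
    print(n, round(out_of_phase_fraction(0.999, n), 2))
```

Rounded to two decimal places, the values for N = 200 and N = 400 agree with the approximately 18% and 33% figures given in the text.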

In what follows, a length of single base repeats, e.g. AAAAA, is
treated as a single base for the purposes of discussing the phase difference between
strands. If the reaction cycle of the four dNTPs is unchanged, then a primer strand
which has failed to extend when the correct dNTP, for example dATP, is in the
reaction cell cannot trail the leading, i.e., majority strands, which did extend correctly,
by more than 3 bases because the fourth base in the dNTP reaction cycle will always 
once again be the correct base (dATP) for the strand which failed to extend 
previously. It is assumed that extension failure is purely statistical, and that any strand
which fails to extend has an equal chance of subsequent extension when the correct
dNTP is supplied, and that this extension probability is sufficiently high that the 
chance of repeated extension failures on the same strand is vanishingly small. For 
example, if the probability of extension failure on a single strand is 0.1%, the 
probability of two extension failures on the same strand is (0.001)^2 or 10^-6. Similarly,
the trailing strand can never resynchronize with the leading strands if extension
subsequently proceeds correctly, because the leading strands will always have
extended by at least one more nucleotide - G, T, or C in the example discussion of an
A extension failure - before the trailing strand can add the missing A. The effect is
that after each complete dNTP cycle the trailing strands always follow the leading
strands by an extension amount that represents the bases added in one complete dNTP
cycle at a given point in the sequence. These observations predict that: (i) the gap 
between the leading and trailing strands perpetually oscillates between 1 and 3 bases 
and can never increase unless a second extension failure occurs on the same strand; 
and (ii) the gap between the leading and trailing strands is independent of the position
along the trailing strand at which the extension failure occurs. This gap at any given
point in the extension of the leading strands is solely a function of the sequence of the
leading strand population up to that point and the dNTP probe cycle. In other words, a
population of trailing strands is produced due to random extension failure at different
points in the sequence, but these trailing strands themselves are all exactly in phase
with each other.

Because the result of an extension failure is to produce a trailing strand 
population that trails the leading strands perpetually by an amount that oscillates 
between one and three nucleotides, assuming that a second extension failure does not 
occur on the trailing strand and that the probing dNTP cycle remains unchanged,
the gap between the leading and trailing strand populations can always be
known by tracking the leading strand sequence by, for example, computer simulation 
and simulating an extension failure event at any point along the sequence. 

Thus the present invention provides, first, a general method of
computer tracking of the sequence information which allows the out-of-phase error 
signals due to extension of trailing strands to be recognized and subtracted from the 
correct signals, and, second, methods of altering the probing dNTP cycle to
selectively extend the trailing strands so that they move back into phase with the
leading strands, thus completely eliminating sequence uncertainty due to out-of-phase 
signals arising from the trailing strands that result from extension failure. 

The statistics which govern the ability to distinguish an incorrect signal
from out-of-phase strands from a correct signal depend upon the noise level and
statistical variation of the fluorescence signal. Assuming that the signal for a correct
1-base extension has a standard deviation of ±5%, then statistically 99.75% of the
signals will have an amplitude between 0.85 and 1.15 (± 3 standard deviations from
the average value) when the average value is 1.0 and the standard deviation is 0.05. If
the extension signal must be at least 85% of the average single extension signal to
register a correct extension, then statistically a correct extension will be missed only
0.13% of the time, i.e. the readout accuracy would be 99.87%. Another 0.13% of the
signals for a correct extension will be greater than 1.15, but the concern is only with
signals that are lower than average and so are more difficult to distinguish from a
growing signal from out-of-phase strands. The statistics for errors arising from
out-of-phase extension of a trailing strand are similar. If the standard deviation of the
trailing strand signals is also ±5% of the mean extension signal, which will be true
whenever the trailing strand intensity approaches the leading strand intensity, then if
the trailing strand intensity does not grow beyond 0.7, the fraction of trailing strand
extensions that give rise to a signal of 0.85 or greater (4 standard deviations beyond the
mean) is less than 0.01%. Thus an out-of-phase signal arising from a single-base
extension on one of the three sets of trailing strands should be distinguishable from
the in-phase signal with ~99.87% accuracy so long as the out-of-phase signal does not
grow beyond ~70% of the in-phase signal.
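
The one-sided tail probabilities used in this argument can be reproduced with a short calculation. The Python sketch below assumes the signals are normally distributed and takes the 3 and 4 standard deviation margins stated in the text at face value; the function name is hypothetical.

```python
import math


def one_sided_tail(z: float) -> float:
    """Probability that a normally distributed value lies more than z
    standard deviations from its mean on one chosen side."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Correct extension read below the 0.85 threshold (3 SD under a mean of 1.0).
missed_correct = one_sided_tail(3.0)
# Trailing-strand signal exceeding its mean by the 4 SD margin quoted above.
false_trailing = one_sided_tail(4.0)

print(f"missed correct extension: {missed_correct:.2%}")
print(f"false trailing-strand call: {false_trailing:.4%}")
```

Under these assumptions the first value is about 0.13% and the second is below 0.01%, in line with the figures in the paragraph above.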

The above discussion assumes that all the extension events correspond
to single base extensions. However, multiple single-base repeats are common in DNA
sequences, thus one must consider the situation where the out-of-phase signal can be
M times larger than that for a single base extension, where M is the repeat number.
For example, if the population of one of the three sets of out-of-phase strands has
grown to 20% of the leading strand population, at which level the in-phase and
out-of-phase signals can readily be distinguished for a single base extension, then if 
this set of out-of-phase strands encounters a 5-base repeat, e.g. AAAAA, the signal 
for that repeat becomes identical in magnitude to that for a single base extension on 
the in-phase strands. Real-time computer monitoring of the extension signals permits
discrimination against such repeat-enhanced out-of-phase signals, for example, by
implementing linear and/or nonlinear auto-regressive moving average (ARMA)
schemes. The essential points here are as follows: (i) the out-of-phase strands are those
that are trailing the majority strands as a result of extension failure; misincorporation
events which could produce leading error strands have the effect of shutting down
further extension on the affected strands and so do not give rise to significant
out-of-phase error signals; (ii) there is always only one population of trailing strands
regardless of where the extension failure occurred; all the primer strands in this
population have been extended to the same point which trails the leading strand
sequence by 1, 2 or 3 bases; and (iii) because the leading strands have always
previously traversed the sequence subsequently encountered by the trailing strands,
the sequence at least 1 base beyond the 3' terminus of the trailing strands is always
known and allows prediction of exactly whether, and by how much, these trailing
strands will extend for any nucleotide supplied, by simulating, in a computer for
example, the effect of an extension failure at any point in the known sequence
upstream of the position to which the leading strands have advanced.

On each incorporation trial, in addition to any possible correct 
extension signal for the leading strands, there may also be an error signal 
corresponding to extension of the trailing strands. For example, let us assume that the
trailing strand population has grown as large as 20% of the leading strand population.
The size of this population can be monitored by detecting the incorporation signal 
when the trailing strands extend and the leading strands do not. Assume that the 
leading strand population has just traversed a single base repeat region on the 
template, for example AAAAA, and incorporated onto the primer the complementary
T repeat: TTTTT. The trailing strands will not traverse this same AAAAA repeat for
at least a complete cycle of the four probing nucleotides, until the next time the 
strands are probed with dTTP. Knowing the size of the trailing strand population from 
the amplitude of its incorporation signals, determined at any point where the leading 
strands do not extend but the trailing strands do, the signal to be expected from the
trailing strand population due to the TTTTT incorporation can be calculated precisely. 
If the trailing strand population is 1/5 as large as the leading strand population, for 
example, this signal will mimic incorporation of a single T on the leading strand
population. In the absence of the computer-aided monitoring method discussed here,
such a false signal would give rise to a drastic sequence error. 
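
The subtraction step described above can be sketched as follows. This is a minimal illustration, not the monitoring method itself: it assumes signals are normalized so that a single-base extension of the whole leading population reads 1.0, and the function and parameter names are hypothetical.

```python
def corrected_leading_signal(observed: float,
                             trailing_fraction: float,
                             trailing_run: int) -> float:
    """Remove the predicted trailing-strand contribution from a measured signal.

    observed          -- measured signal (1.0 = one base on the leading strands)
    trailing_fraction -- trailing population relative to the leading population
    trailing_run      -- number of bases the trailing strands are predicted to
                         add on this probe, known from the sequence already read
    """
    predicted_error = trailing_fraction * trailing_run
    return observed - predicted_error


# Trailing strands (1/5 of the leading population) add TTTTT while the
# leading strands do not extend: the raw signal mimics a single T.
raw = 0.2 * 5
print(round(corrected_leading_signal(raw, 0.2, 5), 6))

# Same probe, but the leading strands also add one T.
print(round(corrected_leading_signal(raw + 1.0, 0.2, 5), 6))
```

In the first case the corrected value is zero, so no base is called; in the second the corrected value is one, so a single T is called despite the identical error contribution.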

Figures 14A and 14B demonstrate how data would appear for a
sequence: [CTGA] GAA ACC AGA AAG TCC [T], probed with a dNTP cycle:
CAGT, close to the primer where no extension failure has occurred (Figure 14A) and
well downstream (Figure 14B) at a point where 60% of the strands have undergone
extension failure and are producing out-of-phase signals, and misincorporation has 
shut down extension on 75% of all strands. The readouts shown start at the second G 
in the sequence (beyond the [CTGA] sequence in parentheses) and end at the last C 
(before the [T] in parentheses). The digital nature of the signal in Figure 14A and
also the amplitude scale should be noted. In Figure 14B, the signal for a single base
extension has been reduced by 60%, from 1.0 to 0.4 due to the extension failure
strands, and by a further factor of 4 to 0.1 due to misincorporation and the resulting 
75% signal loss. However, added to the correct extension signals are signals due to 
the out-of-phase extension of the trailing strands. At first sight, the readout is
completely different from the correct readout shown in Figure 14A, due to the
superposition of signals produced when the trailing strands encounter the sequence
previously traversed by the leading strands. Particularly large errors arise whenever 
the trailing strand population encounters the AAA repeats. For example, the second T 
probe yields a signal amplitude corresponding to an AAAAA repeat instead of the
correct single A, the third G probe gives a signal corresponding to CCC when in fact
there is no C at this point in the leading strand sequence, the fourth T probe reads 4 
A's when the correct sequence has none (the trailing strands encounter the second 
AAA repeat). However, because the sequence from the leading strands is known, the 
false signals arising from the trailing strands can be predicted and subtracted from the
total signal to obtain the correct sequence readout. This is shown in Figure 14C,
where the signals arising from the trailing strands are coded by different shading from
the leading strand signal. Because the signals due to the trailing strands can be
predicted, the error signals can be subtracted to obtain the correct digital sequence
readout shown in Figure 14D. It should be noted that the data in Figure 14D are now
identical to those in Figure 14A, and yield the correct sequence readout for the 
leading strands, the only difference being that the overall intensity is reduced due to 
the assumed loss of signal due to misincorporation and extension failure, the latter
populating the trailing strands. In other words, by keeping track of the sequence in a
computer the effect is as though one could directly visualize the different 
contributions as depicted on the plot in Figure 14C. Therefore, it is possible to predict 
for any probe nucleotide event exactly what the signal from the trailing strand 
population should be, and subtract this error signal from the measured signal to arrive
at a true digital signal representative of the sequence of the leading strand population,
which is the desired result. 

Given the ability to compute and subtract any trailing strand signals as 
discussed, the accuracy with which nucleotide incorporation or non-incorporation on 
the leading strands can be sensed is limited, not by the absolute size of the trailing
strand signal, but instead by the noise on those signals. For example, assume that the
signal for a single-base extension of a trailing strand population equal to 20% of the
leading strand population is 0.2 ± 0.05. If the trailing strands encounter a 5-base
repeat, the resulting signal would be identical in amplitude to that produced by a
single-base extension of the leading strands, but this signal could be subtracted from
the observed signal to yield either a signal resulting from a leading strand extension,
or a null signal corresponding to no extension of the leading strands. Assuming that
the noise is purely statistical and therefore that the relative noise is reduced in
proportion to the square root of the signal amplitude, for a 5-base extension of the
trailing strands or a single extension of the leading strands the signal would be
1 ± (0.05 x √5), i.e. 1 ± 0.11,
because the statistical noise on a set of added signals grows as the square root of the
number of signals. One can subtract from this value a correction signal which is much
more accurately known because the trailing strand signal has been repeatedly
measured yielding better statistics on this value. It is assumed that the uncertainty in
the correction signal is negligible. For no extension of the leading strands, the
resulting difference signal would be 0 ± 0.11, whereas a single extension of the
leading strands would yield a difference signal of 1 ± 0.11; the two signals are
distinguishable with better than 99.9% accuracy.
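
The noise arithmetic in this example can be verified with a brief calculation. The Python sketch below assumes normally distributed noise and a decision threshold placed midway between the two difference signals (0 and 1); the threshold choice and variable names are assumptions made for illustration and are not specified in the text.

```python
import math

single_sd = 0.05   # noise on one trailing-strand single-base signal
repeat_len = 5     # length of the repeat traversed by the trailing strands

# Noise on a sum of independent signals grows as the square root of their number.
combined_sd = single_sd * math.sqrt(repeat_len)

# Assumed decision threshold: midway between "no extension" (0) and
# "single extension" (1) of the leading strands.
threshold = 0.5
z = threshold / combined_sd
error_rate = 0.5 * math.erfc(z / math.sqrt(2.0))
accuracy = 1.0 - error_rate

print(round(combined_sd, 2), accuracy > 0.999)
```

With these assumptions the combined noise rounds to 0.11 and the two difference signals are separated by more than four standard deviations on either side of the threshold, consistent with the accuracy claimed above.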

The example given here is an extreme case: in fact, the extension
failure can be corrected at any point, so that it will be possible to minimize the trailing 
strand population below a level where it would produce signals that make the leading 
strand sequence uncertain. 

There are additional advantages to the computer-aided monitoring
method proposed. First, the signals from the trailing strands serve as an additional
check on the leading strand sequence. Second, the trailing strand population could be
allowed to surpass the leading strand population in magnitude. Without computer-aided
monitoring, readout would have to cease well before this point; however, with
computer-aided monitoring, readout can continue, now using the trailing strands
rather than the leading strands to reveal the sequence. Thus, the strand population that
trails due to only one extension failure now becomes the leading strand population for
the purposes of computer-aided monitoring. This allows readout to continue until
further complications arise from the occurrence of 2 extension failures on the same
strand, producing a new trailing strand population which can be tracked in the same
way as the single failure strands, while the population of strands that have undergone 
no error failure diminishes to the point where it contributes no detectable signal. 

Optimization of reagents, enzyme and reaction conditions should allow 
misincorporation probabilities below 1%, and extension failure probabilities as low as
0.1%. The computer-aided monitoring method of the present invention additionally
provides a means for healing the trailing strand population by selectively extending
this population so that it is again synchronous with the leading strands. For example,
given a dNTP probe cycle of GCTA, and a template sequence (beyond the 3' end of
the primer) of:

GTGCAGATCTG ...

and assuming that when dCTP is in the reaction chamber, the polymerase fails to 
incorporate a C in some fraction of the primer strands, the following results: 

Template          GTG CAG ATC TG ...
Main strands      C

Template          GTG CAG ATC TG ...
Failure strands

At the end of the first cycle, the main strands have extended by ....CA,
while the failure strands have not advanced. After one more complete cycle, the main
strand extension is ....CAC and the failure strands now read ....CA, i.e. now just one
base out of phase.

Template          GTG CAG ATC TG ...
Main strands      CAC

Template          GTG CAG ATC TG ...
Failure strands   CA



Because the phase lag arises from the repeating interaction of the probe cycle
sequence with the template sequence, the unchanged probe cycle can never have the
correct sequence to resynchronize the strands. Instead, if the probe cycle is
unchanged, and if no further extension failures occur, the phase lag for a given failure
strand oscillates perpetually between 1 and 3 bases, counting single base repeats as
one base for this purpose. However, because the leading strand sequence up to the last
extension is always known, one can determine the effect of introducing an extension
failure at some upstream position. It should be noted that an extension failure
introduced at any arbitrary upstream position, or at any base type, always produces the
same phase lag because the effect of an extension failure is to cause extension of the
affected strand to lag by one complete dNTP cycle. Thus, it is possible to alter the
probe cycle sequence, for example to probe with a C, instead of a G, after the last A in
the sequence discussed above. The failure strands would advance while the main
strands would not, and the phase lag would heal. In yet another embodiment, the dNTP
probe cycle may be reversed whenever the phase lag shrinks to only 1 base.
Whenever the phase difference declines to a single base, or repeats of a single base,
simply reversing the probe cycle sequence always resynchronizes the strands.
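
The worked example above (template GTGCAGATCTG, probe cycle GCTA, a failed dCTP step on some strands) can be traced with a short simulation. This is a minimal sketch under a simplifying assumption: each probe extends a strand across every consecutive template base that pairs with the offered dNTP, and the extension failure is modelled as the failure strands simply skipping the first dCTP step. The function names probe, extension and run_example are hypothetical and are not part of the specification.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def probe(template: str, pos: int, dntp: str) -> int:
    """Offer one dNTP; advance past every consecutive template base it pairs with."""
    while pos < len(template) and COMPLEMENT[template[pos]] == dntp:
        pos += 1
    return pos

def extension(template: str, pos: int) -> str:
    """Bases added to the primer so far (complement of the template read)."""
    return "".join(COMPLEMENT[base] for base in template[:pos])

def run_example():
    template = "GTGCAGATCTG"  # template beyond the 3' end of the primer (from the text)
    cycle = "GCTA"            # dNTP probe cycle (from the text)
    main = failure = 0
    history = []

    # Cycle 1: the failure strands miss the dCTP step.
    for dntp in cycle:
        main = probe(template, main, dntp)
        if dntp != "C":
            failure = probe(template, failure, dntp)
    history.append((extension(template, main), extension(template, failure)))

    # Cycle 2: unchanged probe cycle, no further failures.
    for dntp in cycle:
        main = probe(template, main, dntp)
        failure = probe(template, failure, dntp)
    history.append((extension(template, main), extension(template, failure)))

    # Altered probe: offer C instead of G after the last A.
    main = probe(template, main, "C")
    failure = probe(template, failure, "C")
    history.append((extension(template, main), extension(template, failure)))
    return history

for main_ext, failure_ext in run_example():
    print(f"main: {main_ext or '-':<4} failure: {failure_ext or '-'}")
```

In this sketch the two populations read CA and nothing after the first cycle, CAC and CA after the second, and both read CAC once C is probed in place of G, which corresponds to the healing of the phase lag described in the text.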

Figure 15 shows how a leading strand population arising from
incorrect extension of a fraction of primer strands due to nucleotide impurities can
adversely affect the signals from the main population. Using the same template
sequence as before:

[CTGA] GAA ACC AGA AA GTC C [TC AGT]

and the same probe cycle, CAGT, one can simulate the effect of a leading strand
population which is 20% of the main strand population and 2 bases ahead of the
main strands at the time the main strand sequence begins to be read. The leading
strands have already extended by -C TTT. The first C probe extends the main primer
strands by one base complementary to the first G in the sequence, giving a single base
extension signal of 1. The first G extends the leading strands by -GG-, complementary
to the -CC- repeat, giving a signal of 0.4. Greater ambiguity arises when the leading
strands encounter the second -AAA- repeat at the second T probe, increasing the main
strand signal from the correct value for a single base extension to 1.6. In the absence
of further information, this value will be ambiguous or may be interpreted as a 2-base
repeat.
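
The signal values in this example follow from simple superposition of the two populations. The sketch below assumes signals are normalized so that a single base extension of the main strands gives 1, and that the leading population contributes in proportion to its size relative to the main population (20% here); the function name observed_signal is illustrative.

```python
def observed_signal(main_bases: int, lead_bases: int, lead_fraction: float) -> float:
    """Normalized signal when main and leading strands extend in the same probe step."""
    return main_bases + lead_fraction * lead_bases

lead_fraction = 0.20  # leading strands as a fraction of the main strand population

# First G probe: main strands do not extend, leading strands add -GG-.
first_g = observed_signal(0, 2, lead_fraction)
# Second T probe: main strands add one base, leading strands add three (the -AAA- repeat).
second_t = observed_signal(1, 3, lead_fraction)

print(round(first_g, 2), round(second_t, 2))
```

A value of 1.6 lies closer to 2 than to 1, which is why it may be misread as a 2-base repeat unless the leading strand contribution is estimated and subtracted.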

Correction for these ambiguities comes from the fact that the correct
sequence of the main strands is read following the leading strand read. In general, a
large multiple repeat which can give an error signal when encountered by the leading
strands will subsequently give a larger signal when encountered by the main strands,
and superimposed on this correct signal will be a leading strand signal for which there
are three possibilities: (i) zero signal: the leading strands do not extend; (ii) small
signal that does not create ambiguity: the leading strands extend by a single base or a
repeat number small enough not to simulate an additional base extension of the main
strands; (iii) large signal: the leading strands encounter a second large repeat. By
monitoring the main strand sequence, it is possible at each extension to retroactively
estimate the effects of a leading strand population and subtract such signals from the
main strand signals to arrive at a correct sequence. In the case where the leading
strands encounter a repeat large enough to create ambiguity in the sequence, even if
the leading strands subsequently encounter a second or third large repeat when the
main strands encounter the first repeat, the main strands will eventually traverse the
same region to give sufficient information to derive the correct sequence. In other
words, the sequence information at any point is always overdetermined: the signal for
any given extension is always read twice, by the leading strands and the main strands,
and so yields sufficient information to determine both the correct sequence and the
fractional population of the leading strands, which are the two pieces of information
required.

Because the sequence of the leading strand population produced by
impure nucleotides cannot be known until it is subsequently traversed by the main
strands, one cannot know what dNTP probe cycle would act to extend the main
strands while not extending the leading strands, as was the case for a trailing strand
population produced by extension failure. However, as with trailing strands, the gap
between the leading and main strands oscillates perpetually between one and three
bases, and can be reconnected by reversing the dNTP probe sequence whenever the
gap between the leading and main strands shrinks to a single base. Although it cannot
be known when this single base gap occurs, the dNTP probe sequence can be reversed
at regular intervals. Trials indicate that such a process ultimately reconnects
approximately 2/3 of the leading strands. The statistics for this process are as follows.

Statistically, because the gap between the main and leading strands can
be 1, 2 or 3 bases, there is a 1/3 probability that the leading strand population will
have only a 1-base phase lag at any time the cycle is reversed. The 1-base phase
difference will always be healed by a cycle reversal. Another 1/3 of the time the
leading strands are 2 bases ahead at the time the cycle is reversed. For the next
probing base the following possibilities exist:

Lead     Main
strand   strand
  0        0      No extension on either strand:  Prob. 3/4 x 3/4 = 9/16
 +1        0      Phase lag increases:            Prob. 1/4 x 3/4 = 3/12
 +1       +1      Both strands advance:           Prob. 1/4 x 1/4 = 1/16
  0       +1      Phase lag decreases:            Prob. 3/4 x 1/4 = 3/12

Phase lag stays at 2:  Number of chances = 10/16
Phase lag decreases:   Number of chances = 3/12
Phase lag increases:   Number of chances = 3/12



So the chance of making a 2-base gap worse is (3/12)/(10/16 + 3/12) = 28%.

Considering all three gap sizes:

1-base gap heals (33% of population)

2-base gap gets worse 28% of the time: only 1/3 of gaps are 2-base, so 9% total get
worse

3-base gap also gets worse 28% of the time, again 9% overall effect

In sum, 33% heal at a given reversal, 18% lose at a given reversal and the remaining
50% are unchanged. Even assuming the 18% are permanently lost (and a 2-base gap
increased to a 3-base gap can still rejoin), at each subsequent reversal 1/3 of the 50%
of strands left unchanged by the previous reversal are healed, as follows:

Reversal #    Fraction of gaps healed
    1              33%
    2              17%
    3              9%
    4              4.5%
    5              2.5%
    6              1%
  Total            ~67%

Therefore, repeated reversal of the dNTP probe cycle can reduce by 2/3 the effects of
out-of-phase signals due to incorrect extension by nucleotide impurities, or random
extension failure, effectively increasing the read length when limited by either effect
by a factor of 3.
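
The healed fractions tabulated above form a geometric series: at each reversal 1/3 of the still-unchanged gaps heal, and about 1/2 of them carry over unchanged to the next reversal. The sketch below uses those two figures from the text as exact fractions, which is an idealization (the text's 50% is itself rounded); the function name healed_fractions is illustrative.

```python
from fractions import Fraction

def healed_fractions(reversals: int,
                     heal: Fraction = Fraction(1, 3),
                     unchanged: Fraction = Fraction(1, 2)) -> list:
    """Fraction of the original out-of-phase population healed at each reversal."""
    remaining = Fraction(1)  # share of gaps still eligible to heal
    healed = []
    for _ in range(reversals):
        healed.append(heal * remaining)
        remaining *= unchanged
    return healed

per_reversal = healed_fractions(6)
total_after_six = sum(per_reversal)
limit = Fraction(1, 3) / (1 - Fraction(1, 2))  # sum of the infinite series

for n, frac in enumerate(per_reversal, start=1):
    print(f"reversal {n}: {float(frac):.1%}")
print(f"after six reversals: {float(total_after_six):.1%}, limit: {float(limit):.1%}")
```

The six-term sum comes to a little under 2/3 and the infinite series converges to exactly 2/3, in line with the rounded total of about 67% in the table.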

Although the invention has been described herein with reference to
specific embodiments, many modifications and variations therein will readily occur to
those skilled in the art. Accordingly, all such variations and modifications are
included within the intended scope of the invention.

In the Claims:

1. A method of DNA sequencing comprising the steps of: 

(a) providing a template system comprising at least one nucleic 
acid molecule of unknown sequence hybridized to a primer 
oligonucleotide in the presence of a DNA polymerase with 
reduced exonuclease activity; 

(b) contacting the template system with a single type of 
deoxyribonucleotide under conditions which allow extension of 
the primer by incorporation of at least one deoxyribonucleotide 
to the 3' end of the primer to form an extended primer; 

(c) detecting whether extension of the primer has occurred; 

(d) detecting the number of deoxyribonucleotides incorporated 
into the primer; 

(e) removing unincorporated deoxyribonucleotide; and 

(f) repeating steps (a) through (e) to determine the nucleotide 
sequence of the nucleic acid molecule. 

2. The method of Claim 1 wherein the at least one deoxyribonucleotide 
includes a chemiluminescent moiety comprising detecting whether extension of the 
primer has occurred by detecting a chemiluminescent signal emitted by the 
chemiluminescent moiety, and further comprising removing the chemiluminescent
moiety from the template system. 

3. The method of Claim 1 wherein the at least one deoxyribonucleotide
includes a fluorescent moiety comprising detecting whether extension of the primer
has occurred by detecting a fluorescent signal emitted by the fluorescent moiety, and
further comprising removing the fluorescent moiety from the template system.

4. The method of Claim 1 wherein the at least one deoxyribonucleotide
includes a fluorescent moiety comprising detecting whether extension of the primer
has occurred by detecting a fluorescent signal emitted by the fluorescent moiety, and
further comprising destroying the fluorescent signal without removal of the
fluorescent moiety.

5. The method of claim 4 wherein the fluorescent moiety is destroyed 
by reaction with compounds capable of extracting an electron from the excited state 
of the fluorescent moiety. 

6. The method of claim 5 wherein the compound is a 
diphenyliodonium salt. 

7. The method of Claim 1 comprising detecting whether extension of 
the primer has occurred by detecting a change in the concentration of unincorporated 
deoxyribonucleotide. 

8. The method of Claim 1, wherein incorporation of the at least one 
deoxyribonucleotide generates heat, comprising detecting whether extension of the 
primer has occurred by detecting the heat generated by said incorporation. 

9. The method of Claim 8 wherein a thermopile is used to detect the
generated heat.

10. The method of Claim 8 wherein a thermistor is used to detect the
generated heat.

11. The method of Claim 1 wherein the template system further
includes a buffer wherein incorporation of the at least one deoxyribonucleotide
generates heat which is absorbed by said buffer and further comprising measuring the
refractive index of the buffer.

12. The method of Claim 1 comprising detecting whether extension of
the primer has occurred by detecting the concentration of pyrophosphate released by
addition of a deoxyribonucleotide to the 3' end of the primer.

13. The method of Claim 12 wherein the concentration of 
pyrophosphate is detected by hydrolyzing the pyrophosphate and measuring heat 
generated by hydrolysis of the pyrophosphate. 

14. The method of Claim 1 wherein the DNA polymerase is a T4 DNA
polymerase with a substitution of amino acid residue Asp112 by Ala and Glu114 by
Ala.

15. The method of Claim 11 wherein the DNA polymerase further
comprises a T4 DNA polymerase with a substitution of amino acid residue Ile417 by
Val.

16. A method of DNA sequencing comprising the steps of: 

(a) providing a template system comprising at least one nucleic 
acid molecule of unknown sequence hybridized to a primer 
oligonucleotide in the presence of an exonuclease deficient DNA
polymerase; 

(b) contacting the template system with a single type of 
deoxyribonucleotide under conditions which allow extension of 
the primer by incorporation of at least one deoxyribonucleotide 
to the 3' end of the primer to form an extended primer;

(c) detecting whether extension of the primer has occurred
thereby identifying the deoxyribonucleotide added to the 3' end
of the primer;

(d) detecting the number of deoxyribonucleotides incorporated
into the primer;

(e) removing unincorporated deoxyribonucleotide;

(f) contacting the template system with a mixture including an 
exonuclease proficient DNA polymerase, an exonuclease 
deficient DNA polymerase and the identified 
deoxyribonucleotide of step (b); 

(g) removing the mixture of step (f); and 

(h) repeating steps (a) through (g) to determine the nucleotide 
sequence of the nucleic acid molecule. 

17. The method of Claim 16 wherein the at least one
deoxyribonucleotide includes a fluorescent moiety comprising detecting whether
extension of the primer has occurred by detecting a fluorescent signal emitted by the
fluorescent moiety.

18. The method of Claim 16 wherein the at least one 
deoxyribonucleotide includes a fluorescent moiety comprising detecting whether 
extension of the primer has occurred by detecting a fluorescent signal emitted by the 
fluorescent moiety, and further comprising destroying the fluorescent signal without 
removal of the fluorescent moiety. 

19. The method of claim 18 wherein the fluorescent moiety is 
destroyed by reaction with compounds capable of extracting an electron from the 
excited state of the fluorescent moiety. 

20. The method of claim 19 wherein the compound is a 
diphenyliodonium salt.

21. The method of Claim 16 wherein the at least one
deoxyribonucleotide includes a chemiluminescent moiety comprising detecting
whether extension of the primer has occurred by detecting a chemiluminescent signal
emitted by the chemiluminescent moiety.

22. The method of Claim 16 comprising detecting whether extension
of the primer has occurred by detecting a change in the concentration of
unincorporated deoxyribonucleotide.

23. The method of Claim 16 wherein incorporation of the at least one 
deoxyribonucleotide generates heat comprising detecting whether extension of the 
primer has occurred by detecting heat generated by said incorporation. 

24. The method of Claim 23 wherein a thermopile is used to detect the
generated heat.

25. The method of Claim 23 wherein a thermistor is used to detect the
generated heat.

26. The method of claim 16 wherein the template system further 
includes a buffer wherein incorporation of the at least one deoxyribonucleotide 
generates heat which is absorbed by said buffer and further comprising measuring the 
refractive index of the buffer.

27. The method of Claim 16 comprising detecting whether extension 
of the primer has occurred by detecting the concentration of pyrophosphate released 
by incorporation of a deoxyribonucleotide to the 3' end of the primer. 

28. The method of Claim 27 wherein the concentration of 
pyrophosphate is detected by hydrolyzing the pyrophosphate and measuring the heat 
generated by hydrolysis of the pyrophosphate. 

29. The method of Claim 16 wherein the exonuclease deficient DNA
polymerase is a T4 DNA polymerase with a substitution of amino acid residue
Asp112 by Ala and Glu114 by Ala.

30. The method of Claim 26 wherein the exonuclease deficient DNA
polymerase further comprises a T4 DNA polymerase with a substitution of amino acid
residue Ile417 by Val.

31. A method for removal of contaminating nucleotides from a
solution comprising contacting said solution with immobilized DNA complementary
to each of the three possibly contaminating nucleotides in the presence of primers and
polymerase for a time sufficient to incorporate any contaminating nucleotides into
DNA.

32. A method for discriminating between the in-phase and out-of-phase
sequencing signals comprising: 

(i) detecting and measuring error signals thereby 
determining the size of the trailing strand population; 

(ii) between the 3' terminus of the trailing strand primers
and the 3' terminus of the leading strand primers;

(iii) simulating the occurrence of an extension failure at a
point upstream from the 3' terminus of the leading
strands thereby predicting at each extension step the
exact point in the sequence previously traversed by the
leading strands to which the 3' termini of the trailing
strands have been extended;

(iv) predicting for each dNTP introduced the signal to be
expected from correct extension of the trailing strands;
and

(v) subtracting the predicted signal from the measured
signal to yield a signal due only to correct extension of
the leading strand population.